Abstract

Generic programming is an effective methodology for developing reusable software libraries. Many pro- 
gramming languages provide generics and have features for describing interfaces, but none completely 
support the idioms used in generic programming. To address this need we developed the language Q. The 
central feature of Q is the concept, a mechanism for organizing constraints on generics that is inspired 
by the needs of modern C++ libraries. Q provides modular type checking and separate compilation (even of 
generics). These characteristics support modular software development, especially the smooth integration 
of independently developed components. In this article we present the rationale for the design of Q and 
demonstrate the expressiveness of Q with two case studies: porting the Standard Template Library and the 
Boost Graph Library from C++ to Q. The design of Q shares much in common with the concept extension 
proposed for the next C++ Standard (the authors participated in its design) but there are important differences 
described in this article. 

Key words: programming language design, generic programming, generics, polymorphism, concepts, associated types, 
software reuse, type classes, modules, signatures, functors, virtual types 



1. Introduction 

The 1968 NATO Conference on Software Engineering identified a software crisis affect- 
ing large systems such as IBM's OS/360 and the SABRE airline reservation system [1, 2]. At 
this conference McIlroy gave an invited talk entitled Mass-produced Software Components [3]
proposing the systematic creation of reusable software components as a solution to the software 
crisis. He observed that most software is created from similar building blocks, so programmer 
productivity would be increased if a standard set of blocks could be shared among many software
products. We are beginning to see the benefits of software reuse; Douglas McIlroy's vision
is gradually becoming a reality. The number of commercial and open source software libraries
is steadily growing and application builders often turn to libraries for user-interface components,
database access, report creation, numerical routines, and network communication, to name a few. 
Furthermore, larger software companies have benefited from the creation of in-house domain- 
specific libraries which they use to support entire software product lines [4]. 

As the field of software engineering progresses, we learn better techniques for building reusable 
software. In the 1980s Musser and Stepanov developed a methodology for creating highly reusable 
algorithm libraries [5, 6, 7, 8], using the term generic programming for their work. 1 Their ap- 
proach was novel in that they wrote algorithms not in terms of particular data structures but rather 
in terms of abstract requirements on structures based on the needs of the algorithm. Such generic 
algorithms could operate on any data structure provided that it met the specified requirements.
Preliminary versions of their generic algorithms were implemented in Scheme, Ada, and C. In 
the early 1990s Stepanov and Musser took advantage of the template feature of C++ [9] to con- 
struct the Standard Template Library (STL) [10, 11]. The STL became part of the C++ Standard, 
which brought generic programming into the mainstream. Since then, the methodology has been 
successfully applied to the creation of libraries in numerous domains [12, 13, 14, 15, 16]. 

The ease with which programmers develop and use generic libraries varies greatly depend- 
ing on the language features available for expressing polymorphism and requirements on type 
parameters. In 2003 we performed a comparative study of modern language support for generic
programming [17]. The initial study included C++, SML, Haskell, Eiffel, Java, and C#, and we
evaluated the languages by porting a representative subset of the Boost Graph Library [13] to 
each of them. We recently updated the study to include OCaml and Cecil [18]. While some 
languages performed quite well, none were ideal for generic programming. 

Unsatisfied with the state of the art, we began to investigate how to improve language support 
for generic programming. In general we wanted a language that could express the idioms of 
generic programming while also providing modular type checking and separate compilation. In 
the context of generics, modular type checking means that a generic function or class can be type 
checked independently of any instantiation and that the type check guarantees that any well- 
typed instantiation will produce well-typed code. Separate compilation is the ability to compile a 
generic function to native assembly code that can be linked into an application in constant time. 

Our desire for modular type checking was a reaction to serious problems that plague the de- 
velopment and use of C++ template libraries. A C++ template definition is not type checked until 
after it is instantiated, making templates difficult to validate in isolation. Even worse, clients of 
template libraries are exposed to confusing error messages when they accidentally misuse the 
library. For example, the following code tries to use stable_sort with the iterators from the 
list class. 

std::list<int> l;
std::stable_sort(l.begin(), l.end());

Fig. 1 shows a portion of the error message from GNU C++. The error message includes functions
and types that the client should not have to know about such as __inplace_stable_sort and
_List_iterator. It is not clear from the error message who is responsible for the error. The
error message points inside the STL so the client might conclude that there is an error in the 
STL. This problem is not specific to the GNU C++ implementation, but is instead a symptom of 
the delayed type checking mandated by the C++ language definition. 
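
The mismatch behind this error can be stated directly in terms of iterator categories. The sketch below is illustrative only: the helper name stable_sorted_copy is hypothetical, and copying into a temporary vector is just one possible workaround (std::list also offers a member sort function that is stable). It shows that the iterators of std::list are bidirectional rather than random access, and that sorting succeeds once the elements sit in a container whose iterators meet the requirement of stable_sort.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// std::list iterators model Bidirectional Iterator, not Random Access
// Iterator, which is what std::stable_sort requires.
static_assert(
    std::is_same<std::iterator_traits<std::list<int>::iterator>::iterator_category,
                 std::bidirectional_iterator_tag>::value,
    "list iterators are bidirectional");

// Hypothetical helper: stably sort the contents of a list by going
// through a std::vector, whose iterators are random access.
std::list<int> stable_sorted_copy(const std::list<int>& l) {
    std::vector<int> tmp(l.begin(), l.end());
    std::stable_sort(tmp.begin(), tmp.end());
    return std::list<int>(tmp.begin(), tmp.end());
}
```

Note that the static_assert only documents the category; it does not make the error message for the original misuse any shorter.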



1 The term generic programming is often used to mean any use of generics, i.e., any use of parametric polymorphism or 
templates. The term is also used in the functional programming community for function generation based on algebraic 
datatypes, i.e., polytypic programming. Here, we use generic programming solely in the sense of Musser and Stepanov. 
stl_algo.h: In function 'void std::__inplace_stable_sort(_RandomAccessIter, _RandomAccessIter)
    [with _RandomAccessIter = std::_List_iterator<int, int&, int*>]':
stl_algo.h:2565: instantiated from 'void std::stable_sort(_RandomAccessIter, _RandomAccessIter)
    [with _RandomAccessIter = std::_List_iterator<int, int&, int*>]'
stable_sort_error.cpp:5: instantiated from here
stl_algo.h:2345: error: no match for 'std::_List_iterator<int, int&, int*>& - std::_List_iterator<int, int&, int*>&' operator
stl_algo.h:2565: instantiated from 'void std::stable_sort(_RandomAccessIter, _RandomAccessIter)
    [with _RandomAccessIter = std::_List_iterator<int, int&, int*>]'
stable_sort_error.cpp:5: instantiated from here
stl_algo.h:2349: error: no match for 'std::_List_iterator<int, int&, int*>& - std::_List_iterator<int, int&, int*>&' operator
stl_algo.h:2352: error: no match for 'std::_List_iterator<int, int&, int*>& - std::_List_iterator<int, int&, int*>&' operator

Fig. 1. A portion of the error message from a misuse of stable_sort. 

Our desire for separate compilation was driven by the increasingly long compile times we (and 
others) were experiencing when composing sophisticated template libraries. With C++ templates, 
the compilation time of an application is a function of the amount of code in the application plus 
the amount of code in all template libraries used by the application (both directly and indirectly). 
We would much prefer a scenario where generic libraries can be separately compiled so that the 
compilation time of an application is just a function of the amount of code in the application. 

With these desiderata in hand we began laying the theoretical groundwork by developing the
calculus F^G [19]. F^G is based on System F [20, 21], the standard calculus for parametric
polymorphism, and like System F, F^G has a modular type checker and can be separately compiled. In
addition, F^G provides features for precisely expressing constraints on generics, introducing the
concept feature with support for associated types and same-type constraints. The main technical
challenge overcome in F^G is dealing with type equality inside of generic functions. One
of the key design choices in F^G is that models are lexically scoped, making F^G more modular
than Haskell in this regard. (We discuss this in more detail in Section 3.6.1.) Concurrently with
our work on F^G, Chakravarty, Keller, and Peyton Jones responded to our comparative study by
developing an extension to Haskell to support associated types [22, 23].

The next step after F^G was to add two more features needed to express generic libraries:
concept-based overloading (used for algorithm specialization) and implicit argument deduction.
Fully general implicit argument deduction is non-trivial in the presence of first-class
polymorphism (which is present in Q), but some mild restrictions make the problem tractable
(Section 3.5). However, we discovered a deep tension between concept-based overloading and
separate compilation [24]. At this point our work bifurcated into two language designs: the
language Q, which supports separate compilation and only a basic form of concept-based
overloading [25, 26], and the concepts extension to C++ [27], which provides full support for
concept-based overloading but not separate compilation. For the next revision of the C++ Standard,
popularly referred to as C++0X, separate compilation for templates was not practical because the
language already included template specialization, a feature that is also deeply incompatible with
separate compilation. Thus, for C++0X it made sense to provide full support for concept-based
overloading. For Q we placed separate compilation as a higher priority, leaving out template
specialization and requiring programmers to work around the lack of full concept-based overloading
(see Section X).

Table 1 shows the results of our comparative study of language support for generic program- 
ming [18] augmented with new columns for C++0X and Q and augmented with three new rows: 
modular type checking (previously part of "separate compilation"), lexically scoped models, and 
concept-based overloading. Table 2 gives a brief description of the evaluation criteria. 

The rest of this article describes the design of Q in detail. We review the essential ideas of
generic programming and survey the idioms used in the Standard Template Library (Sec-
Table 1

The level of support for generic programming in several languages. A black circle indicates full support for the feature
or characteristic whereas a white circle indicates lack of support. The rating of "-" in the C++ column indicates that while
C++ does not explicitly support the feature, one can still program as if the feature were supported.
* Using the multi-parameter type class extension to Haskell [28].
† Using the proposed associated types extension to Haskell [23].



Table 2

Glossary of Evaluation Criteria

Criterion                        Definition
Multi-type concepts              Multiple types can be simultaneously constrained.
Multiple constraints             More than one constraint can be placed on a type parameter.
Associated type access           Types can be mapped to other types within the context of a generic function.
Constraints on associated types  Concepts may include constraints on associated types.
Retroactive modeling             The ability to add new modeling relationships after a type has been defined.
Type aliases                     A mechanism for creating shorter names for types is provided.
Separate compilation             Generic functions can be compiled independently of calls to them.
Implicit argument deduction      The arguments for the type parameters of a generic function can be deduced and do not need to be explicitly provided by the programmer.
Modular type checking            Generic functions can be type checked independently of calls to them.
Lexically scoped models          Model declarations are treated like any other declaration, and are in scope for the remainder of the enclosing namespace. Models may be explicitly imported from other namespaces.
Concept-based overloading        There can be multiple generic functions with the same name but differing constraints. For a particular call, the most specific overload is chosen.



tion 2). This provides the motivation for the design of the language features in Q (Section 3). We 
then evaluate Q with respect to a port of the Standard Template Library (Section 4) and the Boost 
Graph Library (Section 5). We conclude with a survey of related work (Section 6) and with the 
future directions for our work (Section 7). 

This article is an updated and greatly extended version of [26], providing a more detailed 
Generic programming is a sub-discipline of computer science that deals with finding abstract representations
of efficient algorithms, data structures, and other software concepts, and with their systematic
organization. The goal of generic programming is to express algorithms and data structures in a broadly 
adaptable, interoperable form that allows their direct use in software construction. Key ideas include: 

- Expressing algorithms with minimal assumptions about data abstractions, and vice versa, thus 
making them as interoperable as possible. 

- Lifting of a concrete algorithm to as general a level as possible without losing efficiency; i.e., 
the most abstract form such that when specialized back to the concrete case the result is just as 
efficient as the original algorithm. 

- When the result of lifting is not general enough to cover all uses of an algorithm, additionally pro- 
viding a more general form, but ensuring that the most efficient specialized form is automatically 
chosen when applicable. 

- Providing more than one generic algorithm for the same purpose and at the same level of abstrac- 
tion, when none dominates the others in efficiency for all inputs. This introduces the necessity to 
provide sufficiently precise characterizations of the domain for which each algorithm is the most 
efficient. 



Fig. 2. Definition of Generic Programming from Jazayeri, Musser, and Loos [29] 

rationale for the design of Q and extending our previous comparative study to include Q by 
evaluating a port of the Boost Graph Library to Q. 

2. Generic Programming and the STL 

Fig. 2 reproduces the standard definition of generic programming from Jazayeri, Musser, and 
Loos [29]. The generic programming methodology always consists of the following steps: 1) 
identify a family of useful and efficient concrete algorithms with some commonality, 2) resolve 
the differences by forming higher-level abstractions, and 3) lift the concrete algorithms so they 
operate on these new abstractions. When applicable, there is a fourth step to implement automatic 
selection of the best algorithm, as described in Fig. 2. 
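
As a small illustration of the lifting steps (our own toy example, not taken from the STL; the names sum_array and sum_range are hypothetical), consider a concrete summation over an array of int and its lifted form, which depends only on the operations the algorithm actually uses.

```cpp
#include <cassert>
#include <list>

// Step 1: a concrete algorithm, tied to arrays of int.
int sum_array(const int* a, int n) {
    int s = 0;
    for (int i = 0; i < n; ++i) s += a[i];
    return s;
}

// Steps 2 and 3: the lifted algorithm. The requirements are stated in
// comments, since C++ cannot express them directly.
// where InIter models Input Iterator,
// and T supports T + value_type(InIter) with a result assignable to T.
template <typename InIter, typename T>
T sum_range(InIter first, InIter last, T init) {
    for (; first != last; ++first) init = init + *first;
    return init;
}
```

When sum_range is instantiated with pointers to int, the loop is essentially the loop of sum_array, so little or nothing should be lost in efficiency, while the lifted version also works for a linked list of double.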

2.1. Type requirements, concepts, and models 

The merge algorithm from the STL, shown in Fig. 3, serves as a good example of generic pro- 
gramming. The algorithm does not directly work on a particular data structure, such as an array 
or linked list, but instead operates on an abstract entity, a concept. A concept is a collection of 
requirements on a type, or to look at it a different way, it is the set of all types that satisfy the re- 
quirements. For example, the Input Iterator concept requires that the type have an increment and 
dereference operation, and that both are constant-time operations. (We italicize concept names.) 
A type that meets the requirements is said to model the concept. (It helps to read "models" as 
"implements".) For example, the models of the Input Iterator concept include the built-in pointer 
types, such as int*, the iterator type for the std::list class, and the istream_iterator
adaptor. Constraints on type parameters are primarily expressed by requiring the corresponding
type arguments to model certain concepts. In the merge template, the argument for InIter1 is
required to model the Input Iterator concept. Type requirements are not expressible in C++, so the 
convention is to specify type requirements in comments or documentation as in Fig. 3. 
Fig. 3. The merge algorithm in C++.

template<typename InIter1, typename InIter2, typename OutIter>
// where InIter1 models Input Iterator, InIter2 models Input Iterator.
// OutIter models Output Iterator, writing the value_type of InIter1.
// The value_type of InIter1 and InIter2 are the same type.
// The value_type of InIter1 is Less Than Comparable.
OutIter merge(InIter1 first1, InIter1 last1,
              InIter2 first2, InIter2 last2, OutIter result) {
  while (first1 != last1 && first2 != last2) {
    if (*first2 < *first1) {
      *result = *first2; ++first2;
    } else {
      *result = *first1; ++first1;
    }
    ++result;
  }
  return copy(first2, last2, copy(first1, last1, result));
}



The type requirements for merge refer to relationships between types, such as the value_type
of InIter1. This is an example of an associated type, which maps between types that are part
of a concept. The merge algorithm also needs to express that the value_type of InIter1 and
InIter2 are the same, which we call same-type constraints. Furthermore, the merge algorithm
includes an example of how associated types and modeling requirements can be combined: the
value_type of the input iterators is required to be Less Than Comparable.
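
In C++ as it stands, a same-type constraint can only be approximated. The following sketch is one such approximation, assuming C++11 static_assert and std::is_same; the names same_value_type and checked_merge are hypothetical. It makes the constraint explicit but, like all template code, it is checked only when the template is instantiated.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// True exactly when the two iterator types have the same associated value_type.
template <typename InIter1, typename InIter2>
struct same_value_type
    : std::is_same<typename std::iterator_traits<InIter1>::value_type,
                   typename std::iterator_traits<InIter2>::value_type> {};

// A wrapper around std::merge that states the same-type constraint in code
// rather than in a comment.
template <typename InIter1, typename InIter2, typename OutIter>
OutIter checked_merge(InIter1 first1, InIter1 last1,
                      InIter2 first2, InIter2 last2, OutIter result) {
    static_assert(same_value_type<InIter1, InIter2>::value,
                  "value_type of InIter1 and InIter2 must be the same type");
    return std::merge(first1, last1, first2, last2, result);
}
```

Calling checked_merge with a vector<int> range and a list<double> range would trigger the static_assert message instead of an error from deep inside the library, but the check still happens per instantiation rather than once for the template definition.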

Fig. 4 shows the definition of the Input Iterator concept following the presentation style
used in the SGI STL documentation [30, 31]. In the description, the variable X is used as a
placeholder for the modeling type. The Input Iterator concept requires several associated types:
value_type, difference_type, and iterator_category. Associated types change from
model to model. For example, the associated value_type for int* is int and the associated
value_type for list<char>::iterator is char. The Input Iterator concept requires that the
associated types be accessible via the iterator_traits class. (Traits classes are discussed in
Section 2.4.) The count algorithm, which computes the number of occurrences of a value within
a sequence, is a simple example of the need for this access mechanism: it must access the
difference_type to specify its return type:

template<typename Iter, typename T>
typename iterator_traits<Iter>::difference_type
count(Iter first, Iter last, const T& value);

The reason that count uses the iterator-specific difference_type instead of int is to accommodate
iterators that traverse sequences that may be too long to be measured with an int.
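
For concreteness, a possible body for this declaration is sketched below. It is named my_count (a hypothetical name) so that it does not collide with std::count, and it is not claimed to match any particular library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <type_traits>

// For pointers, the associated difference_type is std::ptrdiff_t.
static_assert(std::is_same<std::iterator_traits<int*>::difference_type,
                           std::ptrdiff_t>::value,
              "difference_type of int* is ptrdiff_t");

// where Iter models Input Iterator and value_type(Iter) can be compared to T with ==.
template <typename Iter, typename T>
typename std::iterator_traits<Iter>::difference_type
my_count(Iter first, Iter last, const T& value) {
    // The counter has the iterator's own difference_type, not int.
    typename std::iterator_traits<Iter>::difference_type n = 0;
    for (; first != last; ++first)
        if (*first == value) ++n;
    return n;
}
```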

In general, a concept may consist of the following kinds of requirements.

refinements are analogous to inheritance. They allow one concept to include the requirements
from another concept.

operations specify the functions that must be implemented for the modeling type.

associated types specify mappings between types, and in C++ are provided using traits classes,
which we discuss in Section 2.4.

nested requirements include requirements on associated types such as modeling a certain concept
or being the same type as another type. For example, the Input Iterator concept requires
that the associated difference_type be a signed integral type.

semantic invariants specify behavioral expectations about the modeling type.

complexity guarantees specify constraints on how much time or space may be used by an operation.



2.2. Overview of the STL 

The high-level structure of the STL is shown in Fig. 5. The STL contains over fifty generic 
algorithms and 18 container classes. The generic algorithms are implemented in terms of a family 
of iterator concepts, and the containers each provide iterators that model the appropriate iterator 
concepts. As a result, the STL algorithms may be used with any of the STL containers. In fact, 
the STL algorithms may be used with any data structure that exports iterators with the required 
capabilities. 

Fig. 6 shows the hierarchy of STL's iterator concepts. An arrow indicates that the source con- 
cept is a refinement of the target. The iterator concepts arose from the requirements of algorithms: 
the need to express the minimal requirements for each algorithm. For example, the merge algo- 
rithm passes through a sequence once, so it only requires the basic requirements of Input Iterator 
for the two ranges it reads from and Output Iterator for the range it writes to. The search 
algorithm, which finds occurrences of a particular subsequence within a larger sequence, must 
make multiple passes through the sequence so it requires Forward Iterator. The inplace_merge 
algorithm needs to move backwards and forwards through the sequence, so it requires Bidirec- 
tional Iterator. And finally, the sort algorithm needs to jump arbitrary distances within the 
sequence, so it requires Random Access Iterator. (The sort function uses the introsort algo- 
rithm [32] which is partly based on quicksort [33].) Grouping type requirements into concepts 
enables significant reuse of these specifications: the Input Iterator concept is directly used as a 
type requirement in over 28 of the STL algorithms. The Forward Iterator concept, which refines Input
Iterator, is used in the specification of over 22 STL algorithms.

The STL includes a handful of common data structures. When one of these data structures does 
not fulfill some specialized purpose, the programmer is encouraged to implement the appropriate 
specialized data structure. All of the STL algorithms can then be made available for the new data 
structure at the small cost of implementing iterators. 

Many of the STL algorithms are higher-order: they take functions as parameters, allowing the 
user to customize the algorithm to their own needs. The STL defines over 25 function objects for 
creating and composing functions. 

The STL also contains a collection of adaptor classes, which are parameterized classes that 
implement some concept in terms of the type parameter (which is the adapted type). For example, 
the back_insert_iterator adaptor implements Output Iterator in terms of any model of Back 
Insertion Sequence. The generic copy algorithm can then be used with back_insert_iterator 
to append some integers to a list. Adaptors play an important role in the plug-and-play nature of 
the STL and enable a high degree of reuse. 
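
The use of copy with back_insert_iterator mentioned above can be sketched as follows. The function name append_some_integers and the particular values are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>

// Appends 7, 8, 9 to a copy of the given list and returns the result.
std::list<int> append_some_integers(std::list<int> l) {
    const int extra[] = {7, 8, 9};
    // back_insert_iterator adapts the list (a Back Insertion Sequence) to
    // the Output Iterator interface: each assignment through the iterator
    // calls l.push_back.
    std::copy(extra, extra + 3, std::back_insert_iterator<std::list<int> >(l));
    return l;
}
```

The copy algorithm knows nothing about lists; it only writes through an Output Iterator, and the adaptor supplies the connection to push_back.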
Input Iterator

Description 

An Input Iterator is an iterator that may be dereferenced to refer to some object, and that may be incremented 
to obtain the next iterator in a sequence. Input Iterators are not required to be mutable. The underlying sequence
elements are not required to be persistent. For example, an Input Iterator could be reading input from the terminal.
Thus, an algorithm may not make multiple passes through a sequence using an Input Iterator. 

Refinement of 

Trivial Iterator. 

Notation 

X A type that is a model of Input Iterator 
T The value type of X 
i , j Objects of type X 
t Object of type T 

Associated types

iterator_traits<X>::value_type
The type of the value obtained by dereferencing an Input Iterator.

iterator_traits<X>::difference_type
A signed integral type used to represent the distance from one iterator to another, or the number of elements in
a range.

iterator_traits<X>::iterator_category
A type convertible to input_iterator_tag.

Definitions 

An iterator is past-the-end if it points beyond the last element of a container. Past-the-end values are nonsingular
and nondereferenceable. An iterator is valid if it is dereferenceable or past-the-end. An iterator i is incrementable
if there is a "next" iterator, that is, if ++i is well-defined. Past-the-end iterators are not incrementable. An Input
Iterator j is reachable from an Input Iterator i if, after applying operator++ to i a finite number of times, i == j.
The notation [i, j) refers to a range of iterators beginning with i and up to but not including j. The range
[i, j) is a valid range if both i and j are valid iterators, and j is reachable from i.



Valid expressions

In addition to the expressions in Trivial Iterator, the following expressions must be valid.

expression  return type       semantics, pre/post-conditions
*i          Convertible to T  pre: i is incrementable
++i         X&                pre: i is dereferenceable, post: i is dereferenceable or past the end
i++                           Equivalent to (void)++i.
*i++                          Equivalent to {T t = *i; ++i; return t;}



Complexity guarantees 

All operations are amortized constant time. 

Models 

istream_iterator, int*, list<string>::iterator, ...



Fig. 4. Documentation for the Input Iterator concept. 
[Figure: three components labeled Algorithms, Iterator Interfaces, and Containers.]

Fig. 5. High-level structure of the STL.

[Figure: nodes labeled Random Access, Bidirectional, Forward, Input, and Output, connected by refinement arrows.]

Fig. 6. The refinement hierarchy of iterator concepts.



2.3. The problem of argument dependent name lookup in C++ 

In C++, uses of names inside of a template definition, such as the use of operator< inside of
merge, are resolved after instantiation. For example, when merge is instantiated with an iterator
whose value_type is of type foo::bar, overload resolution looks for an operator< defined
for foo::bar. If there is no such function defined in the scope of merge, the C++ compiler also
searches the namespace where the arguments' types are defined, so it looks for operator< in
namespace foo. This rule is known as argument dependent lookup (ADL).

The combination of implicit instantiation and ADL makes it convenient to call generic functions.
This is a nice improvement over passing concept operations as explicit arguments to a
generic function, as in the inc example from Section 1. However, ADL has two flaws. The
first problem is that the programmer calling the generic algorithm no longer has control over
which functions are used to satisfy the concept operations. Suppose that namespace foo is a
third party library and the application programmer writing the main function has defined his
own operator< for foo::bar. ADL does not find this new operator<.

The second and more severe problem with ADL is that it opens a hole in the protection that
namespaces are supposed to provide. ADL is applied uniformly to all name lookup, whether or not
the name is associated with a concept in the type requirements of the template. Thus, it is possible
for calls to helper functions to get hijacked by functions with the same name in other namespaces.
Fig. 7 shows an example of how this can happen. The function template lib::generic_fun
calls load with the intention of invoking lib::load. In main we call generic_fun with an
object of type foo::bar, so in the call to load, x also has type foo::bar. Thus, argument
dependent lookup also considers namespace foo when searching for load. There happens to be
a function named load in namespace foo, and it is a slightly better match than lib::load, so it
is called instead, thereby hijacking the call to load.
Fig. 7. Example problem caused by ADL.

#include <iostream>
#include <string>

namespace lib {
  template<typename T> void load(T x, std::string)
    { std::cout << "Proceeding as normal!\n"; }
  template<typename T> void generic_fun(T x)
    { load(x, "file"); }
}

namespace foo {
  struct bar { int n; };
  template<typename T> void load(T x, const char*)
    { std::cout << "Hijacked!\n"; }
}

int main() {
  foo::bar a;
  lib::generic_fun(a);
}

// Output: Hijacked!
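
A library author can defend against this particular hijacking by qualifying the call, since ADL applies only to unqualified function names. The sketch below varies Fig. 7 under two assumptions of ours: the functions return a string instead of printing, so the chosen overload can be observed, and safe_generic_fun is a hypothetical addition.

```cpp
#include <cassert>
#include <string>

namespace lib {
  template <typename T> std::string load(T, std::string) { return "lib::load"; }

  // Unqualified call: ADL also considers the namespace of T.
  template <typename T> std::string generic_fun(T x) { return load(x, "file"); }

  // Qualified call: ADL is suppressed, so only lib::load is considered.
  template <typename T> std::string safe_generic_fun(T x) { return lib::load(x, "file"); }
}

namespace foo {
  struct bar { int n; };
  // A better match for the argument "file" than lib::load, because no
  // conversion to std::string is needed.
  template <typename T> std::string load(T, const char*) { return "foo::load"; }
}
```

Qualification is not a general remedy: it also prevents the library from picking up operations that are meant to be supplied by the argument's namespace, which is exactly the convenience ADL provides for concept operations.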



2.4. Traits classes, template specialization, and separate type checking 

The traits class idiom plays an important role in writing generic algorithms in C++. Unfortu- 
nately there is a deep incompatibility between the underlying language feature, template special- 
ization, and our goal of separate type checking. 

A traits class [34] maps from a type to other types or functions. Traits classes rely on C++ tem- 
plate specialization to perform this mapping. For example, the following is the primary template 
definition for iterator_traits. 

template<typename Iterator> 
struct iterator_traits { ... }; 

A specialization of iterator_traits is defined by specifying particular type arguments for 
the template parameter and by specifying an alternate body for the template. The code below 
shows a user-defined iterator class, named my_iter, and a specialization of iterator_traits 
for my_iter. 

class my_iter {
public:
  float operator*() { ... }
};

template<> struct iterator_traits<my_iter> {
  typedef float value_type;
  typedef int difference_type;
  typedef input_iterator_tag iterator_category;
};

When the type iterator_traits<my_iter> is used in other parts of the program it refers to 
the above specialization. In general, a template use refers to the most specific specialization that 
matches the template arguments, if there is one, or else it refers to an instantiation of the primary 
template definition.

The use of iterator_traits within a template (and template specialization) represents a 
problem for separate compilation. Consider how a compiler might type check the following 
unique_copy function template. 

template<typename InIter, typename OutIter>
OutIter unique_copy(InIter first, InIter last, OutIter result) {
  typename iterator_traits<InIter>::value_type value = *first;
  //...
}

To check the first line of the body, the compiler needs to know that the type of *first is the same
type as (or at least convertible to) the value_type member of iterator_traits<InIter>.
However, prior to instantiation, the compiler does not know what type InIter will be instantiated
to, and which specialization of iterator_traits to choose (and different specializations may
have different definitions of the value_type).

Thus, if we hope to provide modular type checking, we must develop an alternative to using
traits classes for accessing associated types.
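
The difficulty can be made concrete with a small sketch. The traits class my_traits, the iterator types, and the function first_value are all hypothetical stand-ins (my_traits plays the role of iterator_traits). Nothing ties a specialization's value_type to the type that operator* actually returns, so the same line of template code is well typed for some specializations and ill typed for others.

```cpp
#include <cassert>
#include <type_traits>

// Primary template and a partial specialization for pointers.
template <typename Iter> struct my_traits { typedef typename Iter::value_type value_type; };
template <typename T> struct my_traits<T*> { typedef T value_type; };

// A consistent model: operator* returns float and value_type is float.
struct my_iter { float operator*() const { return 1.5f; } };
template <> struct my_traits<my_iter> { typedef float value_type; };

// An inconsistent specialization: operator* returns int, but value_type
// is void*. The language does not reject this declaration.
struct odd_iter { int operator*() const { return 1; } };
template <> struct my_traits<odd_iter> { typedef void* value_type; };

template <typename InIter>
typename my_traits<InIter>::value_type first_value(InIter first) {
    // Whether this initialization type checks depends on which
    // specialization of my_traits is selected for InIter.
    typename my_traits<InIter>::value_type value = *first;
    return value;
}
```

Here first_value compiles for my_iter and for int*, while instantiating it with odd_iter would be rejected, and only at the point of instantiation.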

2.5. Concept-based overloading using the tag dispatching idiom 

One of the main points in the definition of generic programming in Fig. 2 is that it is sometimes 
necessary to provide more than one generic algorithm for the same purpose. When this happens, 
the standard approach in C++ libraries is to provide automatic dispatching to the appropriate 
algorithm using the tag dispatching idiom or enable_if [35]. Fig. 8 shows the advance algo- 
rithm of the STL as it is typically implemented using the tag dispatching idiom. The advance 
algorithm moves an iterator forward (or backward) n positions. There are three overloads of 
advance_dispatch, each with an extra iterator tag parameter. The C++ Standard Library defines 
the following iterator tag classes, with their inheritance hierarchy mimicking the refinement hi- 
erarchy of the corresponding concepts. 

struct input_iterator_tag {};
struct output_iterator_tag {};
struct forward_iterator_tag : public input_iterator_tag {};
struct bidirectional_iterator_tag : public forward_iterator_tag {};
struct random_access_iterator_tag : public bidirectional_iterator_tag {};

The main advance function obtains the tag for the particular iterator from iterator_traits 
and then calls advance_dispatch. Normal static overload resolution then chooses the appro- 
priate overload of advance_dispatch. Both the use of traits and the overload resolution rely 
on knowing actual argument types of the template and the late type checking of C++ templates. 
So the tag dispatching idiom provides another challenge for designing a language for generic 
programming with separate type checking. 
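
As a small illustration of how the tag hierarchy steers overload resolution, the sketch below reports which overload is selected for several standard iterators. The helper names chosen and dispatch_name are hypothetical. Because there is no overload for the forward tag here, a forward iterator falls back to the input overload through the inheritance relationship between the tag classes.

```cpp
#include <forward_list>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Hypothetical overloads that only report which one was selected.
inline std::string chosen(std::input_iterator_tag) { return "input"; }
inline std::string chosen(std::bidirectional_iterator_tag) {
  return "bidirectional";
}
inline std::string chosen(std::random_access_iterator_tag) {
  return "random access";
}

// Obtains the tag from iterator_traits and lets ordinary static overload
// resolution pick the best match, as advance does in Fig. 8.
template <typename Iter>
std::string dispatch_name(Iter) {
  typename std::iterator_traits<Iter>::iterator_category cat;
  return chosen(cat);
}
```

Note that the selection depends entirely on the actual iterator type supplied at the call, which is the information a separately type-checked generic function does not have.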

2.6. Reverse iterators and conditional models 

The reverse_iterator class template adapts a model of Bidirectional Iterator and implements
Bidirectional Iterator, flipping the direction of traversal so operator++ goes backwards
and operator-- goes forwards. An excerpt from the reverse_iterator class template is
shown below.
Fig. 8. The advance algorithm and the tag dispatching idiom.

template<typename InIter, typename Distance>
void advance_dispatch(InIter& i, Distance n, input_iterator_tag) {
  while (n--) ++i;
}

template<typename BidirIter, typename Distance>
void advance_dispatch(BidirIter& i, Distance n,
                      bidirectional_iterator_tag) {
  if (n > 0) while (n--) ++i;
  else while (n++) --i;
}

template<typename RandIter, typename Distance>
void advance_dispatch(RandIter& i, Distance n,
                      random_access_iterator_tag) {
  i += n;
}

template<typename InIter, typename Distance>
void advance(InIter& i, Distance n) {
  typename iterator_traits<InIter>::iterator_category cat;
  advance_dispatch(i, n, cat);
}



template<typename Iter>
class reverse_iterator {
protected:
  Iter current;
public:
  explicit reverse_iterator(Iter x) : current(x) { }
  reference operator*() const { Iter tmp = current; return *--tmp; }
  reverse_iterator& operator++() { --current; return *this; }
  reverse_iterator& operator--() { ++current; return *this; }
  reverse_iterator operator+(difference_type n) const
    { return reverse_iterator(current - n); }
};

The reverse_iterator class template is an example of a type that models a concept con- 
ditionally: if Iter models Random Access Iterator, then so does reverse_iterator<Iter>. 
The definition of reverse_iterator defines all the operations, such as operator+, required 
of a Random Access Iterator. The implementations of these operations rely on the Random Ac- 
cess Iterator operations of the underlying Iter. One might wonder why reverse_iterator 
can be used on iterators such as list<int>::iterator that are bidirectional but not random
access. The reason this works is that a member function such as operator+ is type checked and 
compiled only if it is used. For Q we need a different mechanism to handle this, since function 
definitions are always type checked. 
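
The lazy checking of member functions can be observed with a stripped-down adaptor. The class reversed below is a hypothetical simplification written for this illustration, not the Standard Library's reverse_iterator. Its operator+ uses current - n, which a list iterator does not support, yet the class can still be instantiated with a list iterator as long as operator+ is never called.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Hypothetical, simplified adaptor in the spirit of reverse_iterator.
template <typename Iter>
class reversed {
  Iter current;

public:
  typedef typename std::iterator_traits<Iter>::difference_type difference_type;
  typedef typename std::iterator_traits<Iter>::reference reference;

  explicit reversed(Iter x) : current(x) {}

  reference operator*() const {
    Iter tmp = current;
    return *--tmp;
  }
  reversed& operator++() {
    --current;
    return *this;
  }
  // Needs Iter to provide operator-, that is, random access. The body is
  // instantiated, and therefore checked, only if some caller uses operator+.
  reversed operator+(difference_type n) const { return reversed(current - n); }
};
```

With a vector iterator all three operations are usable; with a list iterator only operator* and operator++ are, and an attempt to call operator+ would be rejected at that call.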

2.7. Summary of language requirements

In the preceding sections we surveyed how generic programming is accomplished in C++, taking note of
the variety of language features and idioms that are used in current practice. Here we
summarize the findings as a list of requirements for a language to support generic programming.

(i) The language provides type parameterized functions with the ability to express constraints 
on the type parameters. The definitions of parameterized functions are type checked inde- 
pendently of how they are instantiated. 

(ii) The language provides a mechanism, such as "concepts", for naming and grouping re- 
quirements on types, and a mechanism for composing concepts (refinement). 

(iii) Type requirements include: 

- requirements for functions and parameterized functions 

- associated types 

- requirements on associated types 

- same-type constraints 

(iv) The language provides an implicit mechanism for providing type-specific operations to a 
generic function, but this mechanism should maintain modularity (in contrast to argument 
dependent lookup in C++). 

(v) The language implicitly instantiates generic functions when they are used. 

(vi) The language provides a mechanism for concept-based dispatching between algorithms. 

(vii) The language provides function expressions and function parameters. 

(viii) The language supports conditional modeling. 

3. The Design of Q 

Q is a statically typed imperative language with syntax and memory model similar to C++. 
We have implemented a compiler that translates Q to C++, but Q could also be interpreted or 
compiled to byte-code. Compilation units are separately type checked and may be separately 
compiled, relying only on forward declarations from other compilation units (even compilation 
units containing generic functions and classes). The language features of Q that support generic
programming are the following:

- Concept and model definitions, including associated types and same-type constraints;

- Constrained polymorphic functions, classes, structs, and type-safe unions;

- Implicit instantiation of polymorphic functions; and

- Concept-based function overloading.

In addition, Q includes the basic types and control constructs of C++.

3.1. Concepts

The following grammar defines the syntax for concepts. 

decl ← concept cid<tyid, ...> { cmem ... };
cmem ← funsig | fundef          // Required operations
     | type tyid;               // Associated types
     | type == type;            // Same type constraints
     | refines cid<type, ...>;
     | require cid<type, ...>;

Fig. 9. The definition of the Input Iterator concept in Q.

concept InputIterator<X> {
  type value;
  type difference;
  refines EqualityComparable<X>;
  refines Regular<X>; // Regular refines Assignable and CopyConstructible
  require SignedIntegral<difference>;
  fun operator*(X b) -> value@;
  fun operator++(X! c) -> X!;
};



The grammar variable cid is for concept names and tyid is for type variables. The type variables
are placeholders for the modeling type (or a list of types for multi-type concepts). The variables
funsig and fundef are function signatures and definitions, whose syntax we introduce later in this section.
In a concept, a function signature says that a model must define a function with the specified 
signature. A function definition in a concept provides a default implementation. 

The syntax type tyid ; declares an associated type; a model of the concept must provide a type 
definition for the given type name. The syntax type == type introduces a same type constraint. In 
the context of a model definition, the two type expressions must refer to the same type. When the 
concept is used in the type requirements of a polymorphic function or class, this type equality 
may be assumed. Type equality in Q is non-trivial, and is explained in Section 3.9. Concepts 
may be composed with refines and require. The distinction is that refinement brings in the 
associated types from the "super" concept. Fig. 9 shows an example of a concept definition in 
Q, the definition of Input Iterator. 

3.2. Models 

The modeling relation between a type and a concept is established with a model definition 
using the following syntax. 

decl ← model [<tyid, ...>] [where { constraint, ... }] cid<type, ...> { decl ... };

The following shows an example of the Monoid concept and a model definition that makes int 
a model of Monoid, using addition for the binary operator and zero for the identity element. 

concept Monoid<T> {
  fun identity_elt() -> T@;
  fun binary_op(T,T) -> T@;
};

model Monoid<int> {
  fun binary_op(int x, int y) -> int@ { return x + y; }
  fun identity_elt() -> int@ { return 0; }
};
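
For comparison, the closest idiom in present-day C++ is a traits class that is specialized once per modeling type. The sketch below is only an approximation written for this illustration: the names monoid and accumulate_monoid are hypothetical, and unlike a Q model definition the specialization is not checked against any declared concept.

```cpp
#include <iterator>

// Hypothetical stand-in for the Monoid concept: a traits class that is left
// undefined for types that have no "model".
template <typename T>
struct monoid;

// Counterpart of `model Monoid<int>`: addition with identity element 0.
template <>
struct monoid<int> {
  static int identity_elt() { return 0; }
  static int binary_op(int x, int y) { return x + y; }
};

// A generic algorithm that uses only the two operations named by the concept.
template <typename Iter>
typename std::iterator_traits<Iter>::value_type
accumulate_monoid(Iter first, Iter last) {
  typedef typename std::iterator_traits<Iter>::value_type T;
  T total = monoid<T>::identity_elt();
  for (; first != last; ++first)
    total = monoid<T>::binary_op(total, *first);
  return total;
}
```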

A model definition must satisfy all requirements of the concept. Requirements for associated
types are satisfied by type definitions. Requirements for operations may be satisfied by function
definitions in the model, by the where clause, or by functions in the lexical scope preceding the
model definition. Refinements and nested requirements are satisfied by preceding model
definitions in the lexical scope or by the where clause.

Fig. 10. reverse_iterator conditionally models the Random Access Iterator concept.

model <Iter> where { RandomAccessIterator<Iter> }
RandomAccessIterator< reverse_iterator<Iter> >
{
  fun operator+(reverse_iterator<Iter> r, difference n)
    -> reverse_iterator<Iter>@
    { return @reverse_iterator<Iter>(r.current + n); }
  fun operator-(reverse_iterator<Iter> r, difference n)
    -> reverse_iterator<Iter>@
    { return @reverse_iterator<Iter>(r.current - n); }
  fun operator-(reverse_iterator<Iter> a, reverse_iterator<Iter> b)
    -> difference
    { return a.current - b.current; }
};

A model may be parameterized by placing type variables inside <>'s after the model keyword.
The following definition establishes that all pointer types are models of Input Iterator. 

model <T> InputIterator<T*> {
  type value = T;
  type difference = ptrdiff_t;
};

The optional where clause in a model definition can be used to introduce constraints on the type 
variables. Constraints are either modeling constraints or same-type constraints. 

constraint ← cid<type, ...> | type == type

Using the where clause we can express conditional modeling. As mentioned in Section 2.6, we 
need conditional modeling to say that reverse_iterator is a model of Random Access Iterator 
whenever the underlying iterator is. Fig. 10 shows a model definition that says just this.

The rules for type checking parameterized model definitions with constraints are essentially the
same as for generic functions, which we discuss in Section 3.4.



3.3. Nominal versus structural conformance 

One of the fundamental design choices of Q was to include model definitions. After all, it 
is possible to instead have the compiler figure out when a type has implemented all of the
requirements of a concept. We refer to the approach of using explicit model definitions as nominal
conformance whereas the implicit approach we call structural conformance. An example of the
nominal versus structural distinction can be seen in the example below. Do the concepts create 
two ways to refer to the same concept or are they different concepts that happen to have the same 
constraints? 

concept A<T> {
  fun foo(T x) -> T;
};

concept B<T> {
  fun foo(T x) -> T;
};



With nominal conformance, the above are two different concepts, whereas with structural con- 
formance, A and B are two names for the same concept. Examples of language mechanisms 
providing nominal conformance include Java interfaces and Haskell type classes. Examples of 
language mechanisms providing structural conformance include ML signatures [36], Objective 
Caml object types [37], CLU type sets [38], and Cforall specifications [39]. 

Choosing between nominal and structural conformance is difficult because both options have 
good arguments in their favor. 

Structural conformance is more convenient than nominal conformance. With nominal
conformance, the modeling relationship is established by an explicit declaration. For example, a Java
class declares that it implements an interface. In Haskell, an instance declaration establishes 
the conformance between a particular type and a type class. When the compiler sees the explicit 
declaration, it checks whether the modeling type satisfies the requirements of the concept and, if 
so, adds the type and concept to the modeling relation. 

Structural conformance, on the other hand, requires no explicit declarations. Instead, the com- 
piler determines on a need-to-know basis whether a type models a concept. The advantage is that 
programmers need not spend time writing explicit declarations. 

Nominal conformance is safer than structural conformance. The usual argument against
structural conformance is that it is prone to accidental conformance. The classic example of this
is a cowboy object being passed to something expecting a Window [40]. The Window interface
includes a draw() method, which the cowboy has, so the type system does not complain even
though something wrong has happened. This is not a particularly strong argument because the
programmer has to make a big mistake for this kind of accidental conformance to occur.

However, the situation changes for languages that support concept-based overloading. For 
example, in Section 2.5 we discussed the tag-dispatching idiom used in C++ to select the best 
advance algorithm depending on whether the iterator type models Random Access Iterator or 
only Input Iterator. With concept-based overloading, it becomes possible for accidental confor- 
mance to occur without the programmer making a mistake. The following C++ code is an example 
where an error would occur if structural conformance were used instead of nominal. 

std::vector<int> v;
std::istream_iterator<int> in(std::cin), in_end;
v.insert(v.begin(), in, in_end);

The vector class has two versions of insert, one for models of Input Iterator and one for 
models of Forward Iterator. An Input Iterator may be used to traverse a range only a single time, 
whereas a Forward Iterator may traverse through its range multiple times. Thus, the version of 
insert for Input Iterator must resize the vector multiple times as it progresses through the input 
range. In contrast, the version of insert for Forward Iterator is more efficient because it first 
discovers the length of the range (by calling std::distance, which traverses the input range),
resizes the vector to the correct length, and then initializes the vector from the range. 

The problem with the above code is that istream_iterator fulfills the syntactic require- 
ments for a Forward Iterator but not the semantic requirements: it does not support multiple 
passes. That is, with structural conformance, there is a false positive and insert dispatches to 
the version for Forward Iterators. The program resizes the vector to the appropriate size for all
the input but it does not initialize the vector because all of the input has already been read. 
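
The single-pass behavior that makes this dispatch unsafe can be reproduced in a few lines. The sketch below is an illustration under stated assumptions: the names two_pass_result and measure_then_copy are hypothetical, and a string stream stands in for std::cin. It mimics the two-pass strategy of the Forward Iterator version of insert by measuring the range first and then traversing the stream again.

```cpp
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record of what each of the two passes observed.
struct two_pass_result {
  std::ptrdiff_t counted;  // length reported by the first pass
  std::size_t copied;      // number of elements seen by the second pass
};

// Mimics the Forward Iterator strategy: measure the range, then copy it.
inline two_pass_result measure_then_copy(const std::string& text) {
  std::istringstream in(text);
  std::istream_iterator<int> first(in), last;
  two_pass_result r;
  r.counted = std::distance(first, last);  // reads the stream to its end
  std::istream_iterator<int> again(in);    // the stream is already exhausted
  std::vector<int> v(again, last);
  r.copied = v.size();
  return r;
}
```

For the input "1 2 3" the first pass counts three elements, but a second traversal of the same stream finds none, which is why a syntactic check alone is not enough to select the Forward Iterator overload.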

Why not both? It is conceivable to provide both nominal and structural conformance on a 
concept-by-concept basis, which is in fact the approach used in the concept extension for C++0X. 
Concepts that are intended to be used for dispatching could be nominal and other concepts could 
be structural. This matches the current C++ practice: some concepts come with traits classes that 
provide nominal conformance whereas other concepts do not (the default situation with C++ tem- 
plates is structural conformance). However, providing both nominal conformance and structural 
conformance complicates the language, especially for programmers new to the language, and de- 
grades its uniformity. Therefore, with Q we provide only nominal conformance, giving priority 
to safety and simplicity over convenience. 

3.4. Generic Functions 

The syntax for generic functions is shown below. The name of the function is the identifier 
after fun, the type parameters are between the <>'s and are constrained by the requirement in 
the where clause. A function's parameters are between the () 's and the return type of a function 
comes after the ->. 

fundef ← fun id [<tyid, ...>] [where { constraint, ... }]
         (type pass [id], ...) -> type pass { stmt ... }
funsig ← fun id [<tyid, ...>] [where { constraint, ... }]
         (type pass [id], ...) -> type pass;
decl   ← fundef | funsig
pass   ← mut ref     // pass by reference
       | @           // pass by value
mut    ← const | ε   // constant
       | !           // mutable
ref    ← & | ε

The default parameter passing mode in Q is read-only pass-by-reference. Read-write
pass-by-reference is indicated by ! and pass-by-value is indicated by @.
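
For readers who think in C++ terms, the three modes can be pictured roughly as follows. This is a sketch that assumes the modes correspond to the familiar C++ parameter forms; the function names are hypothetical, and the correspondence is an illustration rather than a description of what the Q compiler generates.

```cpp
// Q: fun f(int x)   read-only pass-by-reference (the default mode)
inline int read_only(const int& x) { return x + 1; }

// Q: fun f(int! x)  read-write pass-by-reference
inline void read_write(int& x) { x += 1; }

// Q: fun f(int@ x)  pass-by-value: the callee modifies only its own copy
inline int by_value(int x) {
  x += 1;
  return x;
}
```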

The merge algorithm, implemented as a generic function in Q, is shown in Fig. 11. The
function is parameterized on three types: Iter1, Iter2, and Iter3. The dot notation is used to refer
to a member of a model, including associated types such as the value type of an iterator.

assoc ← cid<type, ...>.id | cid<type, ...>.assoc
type  ← assoc

The Output Iterator concept used in the merge function is an example of a multi-parameter 
concept. It has a type parameter X for the iterator and a type parameter T for the type that can be 
written to the iterator. The following is the definition of the Output Iterator concept. 

concept OutputIterator<X,T> {
  refines Regular<X>;
  fun operator<<(X! c, T t) -> X!;
};

In general the body of a generic function contains a sequence of statements. Syntax for some 
of the statements in Q is defined in the following grammar. 

stmt ← let id = expr; | while (expr) stmt | return expr; | expr; | ...

Fig. 11. The merge algorithm in Q.

fun merge<Iter1, Iter2, Iter3>
where { InputIterator<Iter1>, InputIterator<Iter2>,
        LessThanComparable<InputIterator<Iter1>.value>,
        InputIterator<Iter1>.value == InputIterator<Iter2>.value,
        OutputIterator<Iter3, InputIterator<Iter1>.value> }
(Iter1@ first1, Iter1 last1, Iter2@ first2, Iter2 last2, Iter3@ result)
-> Iter3@
{
  while (first1 != last1 and first2 != last2) {
    if (*first2 < *first1) {
      result << *first2; ++first2;
    } else {
      result << *first1; ++first1;
    }
  }
  return copy(first2, last2, copy(first1, last1, result));
}



The let form introduces local variables, deducing the type of the variable from the right-hand 
side expression (similar to the auto proposal for C++0X [41]). 

The body of a generic function is type checked separately from any instantiation of the func- 
tion. The type parameters are treated as abstract types so no type-specific operations may be 
applied to them unless otherwise specified by the where clause. The where clause introduces 
surrogate model definitions and function signatures (for all the required concept operations) into 
the scope of the function. 

Multiple functions with the same name may be defined, and static overload resolution is per- 
formed by Q to decide which function to invoke at a particular call site depending on the argument 
types and also depending on which model definitions are in scope. When more than one overload 
may be called, the most specific overload is called (if one exists) according to the rules described 
in Section 3.10. 

3.5. Function calls and implicit instantiation 

The syntax for calling functions (or polymorphic functions) is the C-style notation: 

expr <— expr ( expr , . . . ) 

Arguments for the type parameters of a polymorphic function need not be supplied at the call 
site: Q will deduce the type arguments by unifying the types of the arguments with the types 
of the parameters and then implicitly instantiate the polymorphic function. The design issues 
surrounding implicit instantiation are described below. All of the requirements in the where 
clause must be satisfied by model definitions in the lexical scope preceding the function call, 
as described in Section 3.6. The following is an example of calling the generic accumulate 
function. In this case, the generic function is implicitly instantiated with type argument int*. 

fun main() -> int@ {
  let a = new int[8];
  a[0] = 1; a[1] = 2; a[2] = 3; a[3] = 4; a[4] = 5;
  let s = accumulate(a, a + 5);
  if (s == 15) return 0;
  else return -1;
}

A polymorphic function may be explicitly instantiated using this syntax: 
expr ← expr<|ty, ...|>
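
The same distinction exists in C++, where template arguments are either deduced from the call or written out explicitly. The function twice below is a hypothetical example used only to show the two forms side by side in C++ syntax.

```cpp
#include <type_traits>

// Hypothetical generic function used only for illustration.
template <typename T>
T twice(T x) { return x + x; }

// Implicit instantiation: T is deduced from the argument type.
static_assert(std::is_same<decltype(twice(21)), int>::value,
              "T is deduced as int");

// Explicit instantiation: T is supplied at the call site, and the int
// argument is then converted to double.
static_assert(std::is_same<decltype(twice<double>(1)), double>::value,
              "T is given explicitly as double");
```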

Following Mitchell [42] we view implicit instantiation as a kind of coercion that transforms 
an expression of one type to another type. In the example above, the accumulate function was 
coerced from 

fun <Iter> where
  { InputIterator<Iter>, Monoid<InputIterator<Iter>.value> }
  (Iter@, Iter) -> InputIterator<Iter>.value@

to

fun (int*@, int*) -> InputIterator<int*>.value@

There are several kinds of implicit coercions in Q, and together they form a subtyping relation
≤. The subtyping relation is reflexive and transitive. Like C++, Q contains some bidirectional
implicit coercions, such as float ≤ double and double ≤ float, so ≤ is not anti-symmetric.
The subtyping relation for Q is defined by a set of subtyping rules. The following is the subtyping 
rule for generic function instantiation. 

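\[
(\textsc{Inst})\quad
\frac{\Gamma \text{ satisfies } \bar{c}}
     {\Gamma \vdash \texttt{fun<}\bar{\alpha}\texttt{> where \{}\bar{c}\texttt{\} (}\bar{\sigma}\texttt{) -> }\tau
      \;\le\; [\bar{\rho}/\bar{\alpha}]\,(\texttt{fun(}\bar{\sigma}\texttt{) -> }\tau)}
\]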

The type arguments ρ are substituted for the type parameters α and the constraints in the where
clause must be satisfied in the current environment. To apply this rule, the compiler must choose 
the type arguments. We call this type argument deduction and discuss it in more detail momen- 
tarily. Constraint satisfaction is discussed in Section 3.6. 

The subtyping relation allows for coercions during type checking according to the subsump- 
tion rule: 

\[
(\textsc{Sub})\quad
\frac{\Gamma \vdash e : \sigma \qquad \Gamma \vdash \sigma \le \tau}
     {\Gamma \vdash e : \tau}
\]

The (Sub) rule is not syntax-directed so its addition to the type system would result in a non- 
deterministic type checking algorithm. The standard workaround is to omit the above rule and 
instead allow coercions in other rules of the type system such as the rule for function application. 
The following is a rule for function application that allows coercions in both the function type 
and in the argument types. 

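\[
(\textsc{App})\quad
\frac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma \vdash e_2 : \sigma_2 \qquad
      \Gamma \vdash \tau_1 \le \texttt{fun(}\sigma_3\texttt{) -> }\tau_2 \qquad
      \Gamma \vdash \sigma_2 \le \sigma_3}
     {\Gamma \vdash e_1(e_2) : \tau_2}
\]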

As mentioned above, the type checker must guess the type arguments ρ to apply the (Inst)
rule. In addition, the (App) rule includes several types that appear from nowhere: σ3 and τ2.
The problem of deducing these types is equivalent to trying to find solutions to a system of 
inequalities. Consider the following example program. 

fun apply<T>(fun(T)->T f, T x) -> T { return f(x); }

fun id<U>(U a) -> U { return a; }

fun main() -> int@ { return apply(id, 0); }

The application apply(id, 0) type checks if there is a solution to the following system:

fun<T>(fun(T)->T, T) -> T ≤ fun(α, β) -> γ
fun<U>(U)->U ≤ α
int ≤ β

The following type assignment is a solution to the above system.

α = fun(int)->int
β = int
γ = int
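
For comparison, the same program can be written with C++ function templates. The sketch below uses the hypothetical names apply_fn and identity to avoid clashing with library names. C++ does not solve a system of inequalities here: a function template passed as an argument is a non-deduced context, so T is deduced from the second argument alone and identity is then resolved against the resulting parameter type int (*)(int). This illustrates the C++ rules only and says nothing about how Q performs deduction.

```cpp
// C++ rendering of the apply/id example (hypothetical names).
template <typename T>
T apply_fn(T (*f)(T), T x) { return f(x); }

template <typename U>
U identity(U a) { return a; }

// T = int comes from the literal 0; identity<int> is then selected to match
// the parameter type int (*)(int).
inline int run_example() { return apply_fn(identity, 0); }
```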

Unfortunately, not all systems of inequalities are as easy to solve as the above example. In fact, 
with Mitchell's original set of subtyping rules, the problem of solving systems of inequalities was 
proved undecidable by Tiuryn and Urzyczyn [43]. There are several approaches to dealing with 
this undecidability. 

Remove the (ARROW) rule. Mitchell's subtyping relation included the usual co/contravariant 
rule for functions. 

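\[
(\textsc{Arrow})\quad
\frac{\sigma_2 \le \sigma_1 \qquad \tau_1 \le \tau_2}
     {\texttt{fun(}\sigma_1\texttt{) -> }\tau_1 \;\le\; \texttt{fun(}\sigma_2\texttt{) -> }\tau_2}
\]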

The (Arrow) rule is nice to have because it allows a function to be coerced to a different type 
so long as the parameter and return types are coercible in the appropriate way. In the following 
example the standard ilogb function is passed to foo even though it does not match the expected
type. The (Arrow) rule allows for this coercion because int is coercible to double. 

include "math.h"; // fun ilogb(double x) -> int;

fun foo(fun(int)->int@ f) -> int@ { return f(1); }

fun main() -> int@ { return foo(ilogb); }

However, the (Arrow) rule is one of the culprits in the undecidability of the subtyping 
problem; removing it makes the problem decidable [43]. The language MLF of Le Botlan and
Remy [44] takes this approach, and for the time being, so does Q. With this restriction, type ar- 
gument deduction is reduced to the variation of unification defined in [44]. Instead of working on 
a set of variable assignments, this unification algorithm keeps track of either a type assignment 
or the tightest lower bound seen so far for each variable. The (App) rule for Q is reformulated as 
follows to use this unify algorithm. 

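\[
(\textsc{App})\quad
\frac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma \vdash \bar{e_2} : \bar{\sigma_2} \qquad
      Q = \{\tau_1 \le \alpha,\ \bar{\sigma_2} \le \bar{\beta}\} \qquad
      Q' = \mathit{unify}(\alpha,\ \texttt{fun(}\bar{\beta}\texttt{) -> }\gamma,\ Q)}
     {\Gamma \vdash e_1(\bar{e_2}) : Q'(\gamma)}
\]
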
In languages where functions are often written in curried form, it is important to provide even 
more flexibility than in the above (App) rule by postponing instantiation, as is done in MLF.
Consider the apply example again, but this time written in curried form. 

fun apply<T>(fun(T)->T f) -> (fun(T)->T)@ {
  return fun(T x) { return f(x); };
}

fun id<U>(U a) -> U { return a; }

fun main() -> int@ { return apply(id)(0); }

In the first application apply (id) we do not yet know that T should be bound to int. The 
instantiation needs to be delayed until the second application apply (id) (0). In general, each 
application contributes to the system of inequalities that needs to be solved to instantiate the 
generic function. In MLF, the return type of each application encodes a partial system of
inequalities. The inequalities are recorded in the types as lower bounds on type parameters. The
following is an example of such a type. 

fun<U> where { fun<T>(T)->T ≤ U } (U) -> U

Postponing instantiation is not as important in Q because functions take multiple parameters and 
currying is seldom used. 

Removal of the arrow rule means that, in some circumstances, the programmer would have to 
wrap a function inside another function before passing the function as an argument. 
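
Such a wrapper is short. The following C++ sketch shows the idea for the ilogb example; the wrapper name ilogb_int is hypothetical, and the code illustrates the workaround in C++ rather than in Q.

```cpp
#include <cmath>

// Expects a function from int to int, like foo in the example above.
inline int foo(int (*f)(int)) { return f(1); }

// Hand-written wrapper that gives ilogb exactly the type fun(int)->int by
// performing the int-to-double conversion explicitly.
inline int ilogb_int(int x) { return std::ilogb(static_cast<double>(x)); }
```

The call foo(ilogb_int) then type checks without any function subtyping, at the cost of one extra definition per coerced function.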

Restrict the language to predicative polymorphism. Another alternative is to restrict the
language so that only monotypes (non-generic types) may be used as the type arguments in an
instantiation. This approach is used by Odersky and Läufer [45] and also by Peyton Jones and
Shields [46]. However, this approach reduces the expressiveness of the language for the sake of 
the convenience of implicit instantiation. 

Restrict the language to second-class polymorphism. Restricting the language of types to
disallow polymorphic types nested inside other types is another way to make the subtyping 
problem decidable. With this restriction the subtyping problem is solved by normal unification. 
Languages such as SML and Haskell 98 use this approach. Like the restriction to predicative 
polymorphism, this approach reduces the expressiveness of the language for the sake of implicit 
instantiation (and type inference). However, there are many motivating use cases for first-class 
polymorphism [47], so throwing out first-class polymorphism is not our preferred alternative. 

Use a semi-decision procedure. Yet another alternative is to use a semi-decision procedure for
the subtyping problem. The advantage of this approach is that it allows implicit instantiation to 
work in more situations, though it is not clear whether this extra flexibility is needed in practice. 
The down side is that there are instances of the subtyping problem where the procedure diverges 
and never returns with a solution. 

3.6. Model lookup (constraint satisfaction) 

The basic idea behind model lookup is simple although some of the details are a bit
complicated. Consider the following program containing a generic function foo with a requirement for
C<T>.

concept C<T> { };
model C<int> { };

fun foo<T> where { C<T> } (T x) -> T { return x; }

fun main() -> int@ {
  return foo(0); // lookup model C<int>
}

At the call foo(0), the compiler deduces the binding T=int and then seeks to satisfy the where
clause, with int substituted for T. In this case the constraint C<int> must be satisfied. In the
scope of the call foo(0) there is a model definition for C<int>, so the constraint is satisfied. We
call C<int> the model head. 

3.6.1. Lexical scoping of models

The design choice to look for models in the lexical scope of the instantiation is an important
choice for Q, and differentiates it from both Haskell and the concept extension for C++. This
choice improves the modularity of Q by preventing model declarations in separate modules from
accidentally conflicting with one another.

Fig. 12. Intentionally overlapping models.

module A {
  model Monoid<int> {
    fun binary_op(int x, int y) -> int@ { return x + y; }
    fun identity_elt() -> int@ { return 0; }
  };
  fun sum<Iter>(Iter first, Iter last) -> int {
    return accumulate(first, last);
  }
}

module B {
  model Monoid<int> {
    fun binary_op(int x, int y) -> int@ { return x * y; }
    fun identity_elt() -> int@ { return 1; }
  };
  fun product<Iter>(Iter first, Iter last) -> int {
    return accumulate(first, last);
  }
}

For example, in Fig. 12 we create sum and product functions in modules A and B respec- 
tively by instantiating accumulate in the presence of different model declarations. This exam- 
ple would not type check in Haskell, even if the two instance declarations were to be placed in 
different modules, because instance declarations implicitly leak out of a module when anything 
in the module is used by another module. This example would be illegal in the C++0X concept
extension because 1) model definitions must appear in the same namespace as their concept, and 2) if
placed in the same namespace, the two model definitions would violate the one-definition-rule. 

It is also quite possible for separately developed modules to include model definitions that 
accidentally overlap. In Q, this is not a problem, as the model definitions will each apply within 
their own module. Model definitions may be explicitly imported from one module to another. 
The syntax for modules and import declarations is shown below. An interesting extension would 
be parameterized modules, but we leave that for future work. 

decl ← module mid { decl ... }        // module
     | scope mid = scope;             // scope alias
     | import scope.cid<type, ...>;   // import model
     | public: decl ...               // public region
     | private: decl ...              // private region
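
Standard C++ has no lexically scoped models, so the nearest approximation to Fig. 12 passes the model explicitly as a template argument. The sketch below is written under that assumption; the names int_monoid and fold are hypothetical. It shows the effect that Q obtains implicitly: two different Monoid structures on int coexist because each module selects its own.

```cpp
namespace A {
// Hypothetical counterpart of module A's model: addition with identity 0.
struct int_monoid {
  static int identity_elt() { return 0; }
  static int binary_op(int x, int y) { return x + y; }
};
}  // namespace A

namespace B {
// Hypothetical counterpart of module B's model: multiplication, identity 1.
struct int_monoid {
  static int identity_elt() { return 1; }
  static int binary_op(int x, int y) { return x * y; }
};
}  // namespace B

// The model is an explicit template argument here, whereas Q finds it
// implicitly in the lexical scope of the call.
template <typename Monoid, typename Iter>
int fold(Iter first, Iter last) {
  int total = Monoid::identity_elt();
  for (; first != last; ++first) total = Monoid::binary_op(total, *first);
  return total;
}

namespace A {
template <typename Iter>
int sum(Iter first, Iter last) { return fold<int_monoid>(first, last); }
}  // namespace A

namespace B {
template <typename Iter>
int product(Iter first, Iter last) { return fold<int_monoid>(first, last); }
}  // namespace B
```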

3.6.2. Constrained models 

In Q, a model definition may itself be parameterized and the type parameters constrained by a
where clause. Fig. 13 shows a typical example of a parameterized model.

Fig. 13. Example of parameterized model definition.

concept Comparable<T> {
  fun operator==(T,T)->bool@;
};

model Comparable<int> { };

struct list<T> { /*...*/ };

model <T> where { Comparable<T> }
Comparable< list<T> > {
  fun operator==(list<T> x, list<T> y) -> bool@ { /*...*/ }
};

fun generic_foo<C> where { Comparable<C> } (C a, C b) -> bool@
  { return a == b; }

fun main() -> int@ {
  let l1 = @list<int>(); let l2 = @list<int>();
  generic_foo(l1, l2);
  return 0;
}

The model definition in the example says that for any type T, list<T> is a model of Comparable if T is a model of
Comparable. Thus, a model definition is like an inference rule or a Horn clause [48] in logic
programming. For example, a model definition of the form

model <T1,...,Tn> where { P1, ..., Pn }
Q { ... };

corresponds to the Horn clause:

(P1 and ... and Pn) implies Q

The model definitions from the example in Fig. 13 could be represented in Prolog with the 
following two rules: 

comparable(int).
comparable(list(T)) :- comparable(T).

The algorithm for model lookup is essentially a logic programming engine: it performs uni- 
fication and backward chaining (similar to how instance lookup is performed in Haskell). Uni- 
fication is used to determine when the head of a model definition matches. For example, in 
Fig. 13, in the call to generic_foo the constraint Comparable< list<int> > needs to be
satisfied. There is a model definition for Comparable< list<T> > and unification of list<int>
and list<T> succeeds with the type assignment T = int. However, we have not yet satis- 
fied Comparable< list<int> > because the where clause of the parameterized model must 
also be satisfied. The model lookup algorithm therefore proceeds recursively and tries to satisfy 
Comparable<int>, which in this case is trivial. This process is called backward chaining: it 
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starts with a goal (a constraint to be satisfied) and then applies matching rules (model definitions) 
to reduce the goal into subgoals. Eventually the subgoals are reduced to facts (model definitions 
without a where clause) and the process is complete. As is typical of Prolog implementations, Q 
processes subgoals in a depth-first manner. 
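The lookup procedure described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the Q compiler's algorithm: types are modeled as strings or tuples, the goal is always a ground type (so one-sided matching stands in for full unification), and the selection of the most specific model is ignored. The names Var, unify, substitute, MODELS, and satisfied are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A type parameter of a model definition, e.g. the T in list<T>."""
    name: str


def unify(pattern, ty, subst):
    """Match a model head against a ground type.

    Returns the extended substitution, or None if the head does not match.
    """
    if isinstance(pattern, Var):
        if pattern in subst:
            return subst if subst[pattern] == ty else None
        return {**subst, pattern: ty}
    if isinstance(pattern, tuple) and isinstance(ty, tuple):
        if len(pattern) != len(ty) or pattern[0] != ty[0]:
            return None
        for p, t in zip(pattern[1:], ty[1:]):
            subst = unify(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == ty else None


def substitute(ty, subst):
    """Apply a substitution to a type pattern."""
    if isinstance(ty, Var):
        return subst[ty]
    if isinstance(ty, tuple):
        return (ty[0],) + tuple(substitute(a, subst) for a in ty[1:])
    return ty


T = Var("T")

# Each entry is (concept, head, where clause), mirroring the two models of Fig. 13.
MODELS = [
    ("Comparable", "int", []),                          # comparable(int).
    ("Comparable", ("list", T), [("Comparable", T)]),   # comparable(list(T)) :- comparable(T).
]


def satisfied(concept, ty, models=MODELS):
    """Backward chaining, depth first: reduce the goal to the subgoals of a matching model."""
    for model_concept, head, where in models:
        if model_concept != concept:
            continue
        subst = unify(head, ty, {})
        if subst is None:
            continue
        if all(satisfied(c, substitute(t, subst), models) for c, t in where):
            return True
    return False
```

For the call in Fig. 13, the goal is satisfied("Comparable", ("list", "int")): the second model matches with T = int, and the remaining subgoal Comparable<int> is discharged by the first model, which is a fact.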

It is possible for multiple model definitions to match a constraint. When this happens the most
specific model definition is used, if one exists. Otherwise the program is ill-formed. We say that
definition A is a more specific model than definition B if the head of A is a substitution instance
of the head of B and if the where clause of A implies the where clause of B. In this context,
implication means that for every constraint c in the where clause of B, c is satisfied in the current
environment augmented with the assumptions from the where clause of A.

Q places very few restrictions on the form of a model definition. The only restriction is that all 
type parameters of a model must appear in the head of the model. That is, they must appear in 
the type arguments to the concept being modeled. For example, the following model definition 
is ill formed because of this restriction. 

    concept C<T> { };

    model <T,U> C<T> { }; // ill formed, U is not in an argument to C

This restriction ensures that unifying a constraint with the model head always produces assign- 
ments for all the type parameters. 

Horn clause logic is by nature powerful enough to be Turing-complete. For example, it is
possible to express general recursive functions. The program in Fig. 14 computes the Ackermann
function at compile time by encoding it in model definitions. This power comes at a price:
determining whether a constraint is satisfied by a set of model definitions is in general
undecidable. Thus, model lookup is not guaranteed to terminate and programmers must take some
care in writing model definitions. We could restrict the form of model definitions to achieve
decidability; however, there are two reasons not to do so. First, restrictions would complicate the
specification of Q and make it harder to learn. Second, there is the danger of ruling out useful
model definitions.
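For readers who want to check the result of the encoding in Fig. 14, the three model definitions correspond to the three usual recursive equations of the Ackermann function. The following Python sketch evaluates those equations at run time on ordinary integers; the helper peano, which renders a number the way the zero and suc types would print, is a hypothetical name introduced here for illustration.

```python
def ack(x, y):
    """Ackermann function, one branch per model definition in Fig. 14."""
    if x == 0:
        return y + 1                      # model <y> Ack<zero, y>
    if y == 0:
        return ack(x - 1, 1)              # model <x> Ack<suc<x>, zero>
    return ack(x - 1, ack(x, y - 1))      # model <x,y> Ack<suc<x>, suc<y>>


def peano(n):
    """Render n as a nested suc<...> type name ending in zero."""
    return "suc<" * n + "zero" + ">" * n
```

With these definitions, peano(ack(2, 3)) yields a type name with nine nested occurrences of suc, which is the type reported in the error message of Fig. 14. Small arguments are advisable: the function grows so quickly that ack(4, 1) already exceeds Python's default recursion limit.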



3.7. Improved error messages 

In the introduction we showed how users of generic libraries in C++ are plagued by hard to 
understand error messages. The introduction of concepts and where clauses in Q solves this 
problem. The following is the same misuse of the stable_sort function, but this time written 
in Q. 

    4 fun main() -> int@ {
    5   let v = @list<int>();
    6   stable_sort(begin(v), end(v));
    7   return 0;
    8 }

In contrast to the long C++ error message (Fig. 1), in Q we get the following:

    test/stable_sort_error.hie:6:
    In application stable_sort(begin(v), end(v)),
    Model MutableRandomAccessIterator<mutable_list_iter<int>>
    needed to satisfy requirement, but it is not defined.

Fig. 14. The Ackermann function encoded in model definitions.

    struct zero { };
    struct suc<n> { };
    concept Ack<x,y> { type result; };

    model <y> Ack<zero,y> { type result = suc<y>; };

    model <x> where { Ack<x, suc<zero> > }
    Ack<suc<x>, zero> { type result = Ack<x, suc<zero> >.result; };

    model <x,y> where { Ack<suc<x>,y>, Ack<x, Ack<suc<x>,y>.result > }
    Ack< suc<x>, suc<y> > {
      type result = Ack<x, Ack<suc<x>,y>.result >.result;
    };

    fun foo(int) { }
    fun main() -> int@ {
      type two = suc< suc<zero> >; type three = suc<two>;
      foo(@Ack<two,three>.result());
      // error: Type (suc<suc<suc<suc<suc<suc<suc<suc<suc<zero>>>>>>>>>)
      //        does not match type (int)
    }

A related problem that plagues authors of generic C++ libraries is that type errors often go 
unnoticed during library development. Again, this is because C++ delays type checking templates 
until instantiation. One of the reasons for such type errors is that the implementation of a template 
is not consistent with its documented type requirements. 

This problem is directly addressed in Q: the implementation of a generic function is type- 
checked with respect to its where clause, independently of any instantiations. Thus, when a 
generic function successfully compiles, it is guaranteed to be free of type errors and the imple- 
mentation is guaranteed to be consistent with the type requirements in the where clause. 

Interestingly, while implementing the STL in Q, the type checker caught several errors in the 
STL as defined in C++. One such error was in replace_copy. The implementation below was 
translated directly from the GNU C++ Standard Library, with the where clause matching the 
requirements for replace_copy in the C++ Standard [49]. 

    196 fun replace_copy<Iter1, Iter2, T>
    197 where { InputIterator<Iter1>, Regular<T>, EqualityComparable<T>,
    198         OutputIterator<Iter2, InputIterator<Iter1>.value>,
    199         OutputIterator<Iter2, T>,
    200         EqualityComparable2<InputIterator<Iter1>.value, T> }
    201 (Iter1@ first, Iter1 last, Iter2@ result, T old, T neu) -> Iter2@ {
    202   for ( ; first != last; ++first)
    203     result << *first == old ? neu : *first;
    204   return result;
    205 }

The Q compiler gives the following error message:

    stl/sequence_mutation.hie:203:
    The two branches of the conditional expression must have the
    same type or one must be coercible to the other.

This is a subtle bug, which explains why it has gone unnoticed for so long. The type requirements 
say that both the value type of the iterator and T must be writable to the output iterator, but the 
requirements do not say that the value type and T are the same type, or coercible to one another. 

3.8. Generic classes, structs, and unions 

The syntax for generic classes, structs, and unions is defined below. The grammar variable
clid is for class, struct, and union names.

    decl     ← class clid polyhdr { classmem ... };
    decl     ← struct clid polyhdr { mem ... };
    decl     ← union clid polyhdr { mem ... };
    mem      ← type id;
    classmem ← mem
             | polyhdr clid(type pass [id], ...) { stmt ... }
             | ~clid() { stmt ... }
    polyhdr  ← [<tyid, ...>] [where { constraint, ... }]

Classes consist of data members, constructors, and a destructor. There are no member functions;
normal functions are used instead. Data encapsulation (public/private) is specified at
the module level instead of inside the class. Classes, structs, and unions are used as types using
the syntax below. Such a type is well-formed if the type arguments are well-formed and if the
requirements in its where clause are satisfied.

    type ← clid[<type, ...>]

3.9. Type equality 

There are several language constructions in Q that make it difficult to decide when two types 
are equal. Generic functions complicate type equality because the names of the type parameters 
do not matter. So, for example, the following two function types are equal: 

fun<T>(T)->T = fun<U>(U)->U 

The order of the type parameters does matter (because a generic function may be explicitly 
instantiated) so the following two types are not equal. 

    fun<S,T>(S,T)->T ≠ fun<T,S>(S,T)->T

Inside the scope of a generic function, type parameters with different names are assumed to be 
different types (this is a conservative assumption). So, for example, the following program is ill 
formed because variable a has type S whereas function f is expecting an argument of type T. 

fun foo<S, T>(S a, fun(T)->T f) -> T { return f(a); } 

Associated types and same-type constraints also affect type equality. First, if there is a model 
definition in the current scope such as: 

model C<int> { type bar = bool; }; 

then we have the equality C<int>.bar = bool.

Inside the scope of a generic function, same-type constraints help determine when two types
are equal. For example, the following version of foo is well formed:

    fun foo_1<T, S> where { T == S } (fun(T)->T f, S a) -> T { return f(a); }

There is a subtle difference between the above version of foo and the following one. The reason
for the difference is that same-type constraints are checked after type argument deduction.

    fun foo_2<T>(fun(T)->T f, T a) -> T { return f(a); }

    fun id(double x) -> double { return x; }

    fun main() -> int@ {
      foo_1(id, 1.0); // ok
      foo_1(id, 1);   // error: Same type requirement violated, double != int
      foo_2(id, 1.0); // ok
      foo_2(id, 1);   // ok
    }

In the first call to foo_1 the compiler deduces T=double and S=double from the arguments
id and 1.0. The compiler then checks the same-type constraint T == S, which in this case is
satisfied. For the second call to foo_1, the compiler deduces T=double and S=int and then the
same-type constraint T == S is not satisfied. The first call to foo_2 is straightforward. For the
second call to foo_2, the compiler deduces T=double from the type of id and the argument 1 is
implicitly coerced to double.
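The ordering of the two phases (deduce type arguments first, check same-type constraints afterwards) can be mimicked with a small Python model. This is a toy sketch that covers only the four calls above and is not Q's actual deduction algorithm: types are strings or tuples, the only implicit coercion assumed is int to double, and coercion is only tried for a top-level argument whose type parameter has already been deduced. The names Var, COERCIBLE, match, and check_call are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A type parameter of a generic function."""
    name: str


# (from, to) pairs of implicit coercions; an assumption made for this sketch.
COERCIBLE = {("int", "double")}


def match(param, arg, subst, top=True):
    """Deduce type arguments by matching a parameter type against an argument type."""
    if isinstance(param, Var):
        if param not in subst:
            subst[param] = arg                     # first occurrence: deduce
            return True
        bound = subst[param]
        return bound == arg or (top and (arg, bound) in COERCIBLE)
    if isinstance(param, tuple):                   # e.g. ("fun", T, T)
        return (isinstance(arg, tuple) and len(param) == len(arg)
                and param[0] == arg[0]
                and all(match(p, a, subst, top=False)
                        for p, a in zip(param[1:], arg[1:])))
    return param == arg or (top and (arg, param) in COERCIBLE)


def check_call(params, same_type, args):
    """Phase 1: type argument deduction. Phase 2: same-type constraints."""
    subst = {}
    for p, a in zip(params, args):
        if not match(p, a, subst):
            return "error: argument type mismatch"
    for lhs, rhs in same_type:
        if subst[lhs] != subst[rhs]:
            return ("error: Same type requirement violated, "
                    f"{subst[lhs]} != {subst[rhs]}")
    return "ok"


T, S = Var("T"), Var("S")
ID = ("fun", "double", "double")                   # type of id
FOO_1 = ([("fun", T, T), S], [(T, S)])             # foo_1<T, S> where { T == S }
FOO_2 = ([("fun", T, T), T], [])                   # foo_2<T>
```

In this model, check_call(*FOO_1, [ID, "int"]) deduces T=double and S=int independently and only then fails the constraint, whereas check_call(*FOO_2, [ID, "int"]) succeeds because T is already fixed to double when the second argument is examined.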

Type equality is a congruence relation, which means several things. First it means type equality
is an equivalence relation, so it is reflexive, symmetric, and transitive. Thus, for any types
$\rho$, $\sigma$, and $\tau$ we have

- $\tau = \tau$
- $\sigma = \tau$ implies $\tau = \sigma$
- $\rho = \sigma$ and $\sigma = \tau$ implies $\rho = \tau$

For example, the following function is well formed: 

fun foo<R,S,T> where { R == S, S == T} 
(fun(T)->S f, R a) -> T { return f(a); } 

The type expression R (the type of a) and the type expression T (the parameter type of f) both
denote the same type.

The second aspect of type equality being a congruence is that it propagates in certain ways
with respect to type constructors. For example, if we know that S = T then we also know that
fun(S)->S = fun(T)->T. Similarly, if we have defined a generic struct such as:

    struct bar<U> { };

then S = T implies bar<S> = bar<T>. The propagation of equality also goes in the other
direction. For example, bar<S> = bar<T> implies that S = T. The congruence extends to associated
types. So S = T implies C<S>.bar = C<T>.bar. However, for associated types, the propagation
does not go in the reverse direction. So C<S>.bar = C<T>.bar does not imply that S = T. For
example, given the model definitions

    model C<int> { type bar = bool; };
    model C<float> { type bar = bool; };

we have C<int>.bar = C<float>.bar but this does not imply that int = float.

Like type parameters, associated types are in general assumed to be different from one another.
So the following program is ill-formed:

    concept C<U> { type bar; };

    fun foo<S, T> where { C<S>, C<T> } (C<S>.bar a, fun(C<T>.bar)->T f) -> T
    { return f(a); }

The next program is also ill formed.

    concept D<U> { type bar; type zow; };

    fun foo<T> where { D<T> } (D<T>.bar a, fun(D<T>.zow)->T f) -> T
    { return f(a); }

In the compiler for Q we use the congruence closure algorithm by Nelson and Oppen [50] to
keep track of which types are equal. The algorithm is efficient: $O(n \log n)$ time complexity on
average, where n is the number of types. It has $O(n^2)$ time complexity in the worst case. This
can be improved by instead using the Downey-Sethi-Tarjan algorithm which is $O(n \log n)$ in the
worst case [51].
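To make the propagation rules of this section concrete, here is a deliberately naive congruence closure in Python. It is a sketch under stated assumptions and is neither the Nelson-Oppen nor the Downey-Sethi-Tarjan algorithm: it recomputes a fixpoint over all pairs of terms, which is far slower than either. Types are modeled as strings (base types and type parameters) or tuples (constructor applications), an associated type such as C<S>.bar is written ("C.bar", "S"), and the set of injective constructors is supplied by the caller. The class name Congruence and its methods are hypothetical.

```python
class Congruence:
    """Naive congruence closure over type expressions using union-find."""

    def __init__(self, injective=()):
        self.parent = {}
        # Constructors for which bar<S> = bar<T> implies S = T
        # (generic structs, function types), but not associated types.
        self.injective = set(injective)

    def add(self, t):
        """Register a term and all of its subterms."""
        if t not in self.parent:
            self.parent[t] = t
            if isinstance(t, tuple):
                for arg in t[1:]:
                    self.add(arg)

    def find(self, t):
        """Return the representative of t's equivalence class."""
        self.add(t)
        while self.parent[t] != t:
            self.parent[t] = self.parent[self.parent[t]]   # path halving
            t = self.parent[t]
        return t

    def _union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb
            return True
        return False

    def _close(self):
        """Propagate equalities until nothing changes."""
        changed = True
        while changed:
            changed = False
            apps = [t for t in self.parent if isinstance(t, tuple)]
            for i, s in enumerate(apps):
                for t in apps[i + 1:]:
                    if s[0] != t[0] or len(s) != len(t):
                        continue
                    args_equal = all(self.find(x) == self.find(y)
                                     for x, y in zip(s[1:], t[1:]))
                    same = self.find(s) == self.find(t)
                    if args_equal and not same:
                        # S = T implies bar<S> = bar<T> and C<S>.bar = C<T>.bar
                        changed |= self._union(s, t)
                    elif same and not args_equal and s[0] in self.injective:
                        # bar<S> = bar<T> implies S = T (not for associated types)
                        for x, y in zip(s[1:], t[1:]):
                            changed |= self._union(x, y)

    def merge(self, a, b):
        """Record the equality a = b, e.g. from a same-type constraint."""
        self._union(a, b)
        self._close()

    def equal(self, a, b):
        self.add(a)
        self.add(b)
        self._close()
        return self.find(a) == self.find(b)
```

For example, after cc = Congruence(injective={"bar", "fun"}) and cc.merge("S", "T"), the query cc.equal(("bar", "S"), ("bar", "T")) holds. Recording both ("C.bar", "int") = "bool" and ("C.bar", "float") = "bool" makes the two associated types equal without making int and float equal, as in the model definitions above.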

3.10. Function overloading and concept-based overloading 

Multiple functions with the same name may be defined and static overload resolution is per- 
formed to decide which function to invoke at a particular call site. The resolution depends on 
the argument types and on the model definitions in scope. When more than one overload may be 
called, the most specific overload is called if one exists. The basic overload resolution rules are 
based on those of C++. 

In the following simple example, the second foo is called.

    fun foo() -> int@ { return -1; }
    fun foo(int x) -> int@ { return 0; }
    fun foo(double x) -> int@ { return -1; }
    fun foo<T>(T x) -> int@ { return -1; }

    fun main() -> int@ { return foo(3); }

The first foo has the wrong number of arguments, so it is immediately dropped from
consideration. The second and fourth are given priority over the third because they can exactly match
the argument type int (for the fourth, type argument deduction results in T=int), whereas the
third foo requires an implicit coercion from int to double. The second foo is favored over the
fourth because it is more specific.

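A function f is a more specific overload than function g if g is callable from f but not vice
versa. A function g is callable from function f if you could call g from inside f, forwarding all
the parameters of f as arguments to g, without causing a type error. More formally, if f has type
$\mathtt{fun}\langle\overline{t_f}\rangle\ \mathtt{where}\ C_f\ (\overline{\sigma_f}) \to \tau_f$
and g has type
$\mathtt{fun}\langle\overline{t_g}\rangle\ \mathtt{where}\ C_g\ (\overline{\sigma_g}) \to \tau_g$
then g is callable from f if

$$\overline{\sigma_f} \le [\overline{t_g}/\overline{\rho}]\,\overline{\sigma_g} \quad\text{and}\quad C_f \text{ implies } [\overline{t_g}/\overline{\rho}]\,C_g$$

for some $\overline{\rho}$.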

In general there may not be a most specific overload in which case the program is ill-formed.
In the following example, both foo's are callable from each other and therefore neither is more
specific.

    fun foo(double x) -> int@ { return 1; }
    fun foo(float x) -> int@ { return -1; }
    fun main() -> int@ { return foo(3); }

Fig. 15. The advance algorithms using concept-based overloading.

    fun advance<Iter> where { InputIterator<Iter> }
    (Iter! i, InputIterator<Iter>.difference@ n) {
      for (; n != zero(); --n)
        ++i;
    }

    fun advance<Iter> where { BidirectionalIterator<Iter> }
    (Iter! i, InputIterator<Iter>.difference@ n) {
      if (zero() < n)
        for (; n != zero(); --n)
          ++i;
      else
        for (; n != zero(); ++n)
          --i;
    }

    fun advance<Iter> where { RandomAccessIterator<Iter> }
    (Iter! i, InputIterator<Iter>.difference@ n) {
      i = i + n;
    }

In the next example, neither foo is callable from the other so neither is more specific.

    fun foo<T>(T x, int y) -> int@ { return 1; }
    fun foo<T>(int x, T y) -> int@ { return -1; }
    fun main() -> int@ { return foo(3, 4); }

In Section 2.5 we showed how to accomplish concept-based overloading of several versions of
advance using the tag dispatching idiom in C++. Fig. 15 shows three overloads of advance
implemented in Q. The signatures for these overloads are the same except for their where clauses.
The concept BidirectionalIterator is a refinement of InputIterator, so the second version
of advance is more specific than the first. The concept RandomAccessIterator is a
refinement of BidirectionalIterator, so the third advance is more specific than the second.

The code in Fig. 16 shows two calls to advance. The first call is with an iterator for a
singly-linked list. This iterator is a model of InputIterator but not RandomAccessIterator; the
overload resolution chooses the first version of advance. The second call to advance is with a
pointer, which is a RandomAccessIterator, so the third version of advance is called.
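When the overloads differ only in their where clauses, as in Fig. 15, the "callable from" ordering reduces to implication between concepts along the refinement hierarchy. The following Python sketch illustrates that special case only; it does not handle differing parameter types or coercions. The tables REFINES, MODELS, and OVERLOADS and the functions implies and resolve are hypothetical names, and the entry recording that the slist iterator models ForwardIterator is an assumption made for the example.

```python
# Direct refinements, following the iterator hierarchy of Figs. 18 and 19.
REFINES = {
    "InputIterator": [],
    "ForwardIterator": ["InputIterator"],
    "BidirectionalIterator": ["ForwardIterator"],
    "RandomAccessIterator": ["BidirectionalIterator"],
}

# Most refined concept modeled by each argument type (assumed for illustration).
MODELS = {
    "slist_iterator<int>": "ForwardIterator",
    "int*": "RandomAccessIterator",
}

# The three overloads of advance, identified by the concept in their where clause.
OVERLOADS = [
    ("advance version 1", "InputIterator"),
    ("advance version 2", "BidirectionalIterator"),
    ("advance version 3", "RandomAccessIterator"),
]


def implies(c, d):
    """True if every model of concept c is also a model of concept d."""
    return c == d or any(implies(parent, d) for parent in REFINES[c])


def resolve(arg_type, overloads=OVERLOADS):
    """Pick the most specific viable overload, or fail if there is none."""
    modeled = MODELS[arg_type]
    viable = [(name, c) for name, c in overloads if implies(modeled, c)]
    # An overload is most specific if its where clause implies every other viable one.
    best = [name for name, c in viable
            if all(implies(c, d) for _, d in viable)]
    if len(best) != 1:
        raise TypeError("no unique most specific overload")
    return best[0]
```

Here resolve("slist_iterator<int>") selects version 1 because only the InputIterator overload is viable, while resolve("int*") finds all three viable and selects version 3, matching the comments in Fig. 16.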

Concept-based overloading in Q is entirely based on static information available during the 
type checking and compilation of the call site. This presents some difficulties when trying to 
resolve to optimized versions of an algorithm from within another generic function. Section ?? 
discusses the issues that arise and presents an idiom that ameliorates the problem. 

3.11. Function expressions 

The following is the syntax for function expressions and function types. 






Fig. 16. Example calls to advance and overload resolution.

    use "slist.g";
    use "basic_algorithms.g";   // for copy
    use "iterator_functions.g"; // for advance
    use "iterator_models.g";    // for iterator models for int*

    fun main() -> int@ {
      let sl = @slist<int>();
      push_front(1, sl); push_front(2, sl);
      push_front(3, sl); push_front(4, sl);
      let in_iter = begin(sl);
      advance(in_iter, 2);   // calls version 1, linear time
      let rand_iter = new int[4];
      copy(begin(sl), end(sl), rand_iter);
      advance(rand_iter, 2); // calls version 3, constant time
      if (*in_iter == *rand_iter) return 0;
      else return -1;
    }



The body of a function expression may be either a sequence of statements enclosed in braces or 
a single expression following a colon. The return type of a function expression is deduced from 
the return statements in the body, or from the single expression. 

The following example computes the sum of an array using for_each and a function expression.
(Of course, the accumulate function is the appropriate algorithm for this computation, but then
the example would not demonstrate the use of function expressions.)

    fun main() -> int@ {
      let n = 8;
      let a = new int[n];
      for (let i = 0; i != n; ++i)
        a[i] = i;
      let sum = 0;
      for_each(a, a + n, fun(int x) p=&sum { *p = *p + x; });
      return sum - (n * (n-1))/2;
    }

The expression

    fun(int x) p=&sum { *p = *p + x; }

creates a function object. The body of a function expression is not lexically scoped, so a direct use
of sum in the body would be an error. The initialization p=&sum declares a data member inside
the function object with type int* and copy constructs the member with the address &sum.
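The function object that such an expression denotes can be imitated in Python with a class whose state is passed in explicitly instead of being captured from the enclosing scope. This is only an analogy: Python has no pointers, so a one-element list stands in for the address &sum, and the names AddInto, for_each, and main are hypothetical.

```python
class AddInto:
    """Function object with one explicitly initialized data member p."""

    def __init__(self, cell):
        self.p = cell                   # plays the role of p = &sum

    def __call__(self, x):
        self.p[0] = self.p[0] + x       # *p = *p + x


def for_each(seq, f):
    """Apply f to every element, as the STL algorithm does."""
    for x in seq:
        f(x)
    return f


def main():
    n = 8
    a = list(range(n))                  # a[i] = i
    total = [0]                         # addressable stand-in for the variable sum
    for_each(a, AddInto(total))
    return total[0] - (n * (n - 1)) // 2
```

As in the Q program, main() returns 0 when the accumulated sum equals n(n-1)/2.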

The primary motivation for non-lexically scoped function expressions is to keep the design
close to C++ so that function expressions can be directly compiled to function objects in C++.
However, this design has some drawbacks as we discovered while porting the STL to Q.

Most STL implementations implement two separate versions of find_subsequence, one
written in terms of operator== and the other in terms of a function object. The version using
operator== could be written in terms of the one that takes a function object, but it is not
written that way. The original reason for this was to improve efficiency, but with a modern
optimizing compiler there should be no difference in efficiency: all that is needed to erase the
difference is some simple inlining. In the Q implementation we write the operator== version of
find_subsequence in terms of the higher-order version. The following code shows how this is
done; it is a bit more complicated than we would have liked.

    fun find_subsequence<Iter1, Iter2>
    where { ForwardIterator<Iter1>, ForwardIterator<Iter2>,
            ForwardIterator<Iter1>.value == ForwardIterator<Iter2>.value,
            EqualityComparable<ForwardIterator<Iter1>.value> }
    (Iter1 first1, Iter1 last1, Iter2 first2, Iter2 last2) -> Iter1@
    {
      type T = ForwardIterator<Iter1>.value;
      let cmp = model EqualityComparable<T>.operator==;
      return find_subsequence(first1, last1, first2, last2,
                              fun(T a, T b) c=cmp: c(a, b));
    }

It would have been simpler to write the function expression as

    fun(T a, T b) : a == b

However, this is an error in Q because the operator== from the EqualityComparable<..>
requirement is a local name, not a global one, and is therefore not in scope for the body of the
function expression. The workaround is to store the comparison function as a data member of
the function object. The expression

    model EqualityComparable<T>.operator==

accesses the operator== member from the model of EqualityComparable for type T.

Examples such as these are a convincing argument that lexical scoping should be allowed in 
function expressions, and the next generation of Q will support this feature. 

3.12. First-class polymorphism 

In the introduction we mentioned that Q is based on System F. One of the hallmarks of System
F is that it provides first-class polymorphism. That is, polymorphic objects may be passed to and
returned from functions. This is in contrast to the ML family of languages, where polymorphism
is second class. In Section 3.5 we discussed how the restriction to second-class polymorphism
simplifies type argument deduction, reducing it to normal unification. However, we prefer to
retain first-class polymorphism and use the somewhat more complicated variant of unification
from MLF.

One of the reasons to retain first-class polymorphism is to retain the expressiveness of function
objects in C++. A function object may have member function templates and may therefore be used
polymorphically. The following program is a simple use of first-class polymorphism in Q. Note
that f is applied to arguments of different types.

    fun foo(fun<T>(T)->T f) -> int@ { return f(1) + d2i(f(-1.0)); }
    fun id<T>(T x) -> T { return x; }
    fun main() -> int@ { return foo(id); }

Fig. 17. Some STL Algorithms in Q.

    fun find<Iter> where { InputIterator<Iter> }
    (Iter@ first, Iter last,
     fun(InputIterator<Iter>.value)->bool@ pred) -> Iter@ {
      while (first != last and not pred(*first)) ++first;
      return first;
    }

    fun find<Iter> where { InputIterator<Iter>,
                           EqualityComparable<InputIterator<Iter>.value> }
    (Iter@ first, Iter last, InputIterator<Iter>.value value) -> Iter@ {
      while (first != last and not (*first == value)) ++first;
      return first;
    }

    fun remove<Iter> where { MutableForwardIterator<Iter>,
                             EqualityComparable<InputIterator<Iter>.value> }
    (Iter@ first, Iter last, InputIterator<Iter>.value value) -> Iter@ {
      first = find(first, last, value);
      let i = @Iter(first);
      return first == last ? first : remove_copy(++i, last, first, value);
    }



4. Analysis of Q and the STL 

In this section we analyze the interdependence of the language features of Q and generic li- 
brary design in light of implementing the STL. A primary goal of generic programming is to 
express algorithms with minimal assumptions about data abstractions, so we first look at how the 
generic functions of Q can be used to accomplish this. Another goal of generic programming is 
efficiency, so we investigate the use of function overloading in Q to accomplish automatic algo- 
rithm selection. We conclude this section with a brief look at implementing generic containers 
and adaptors in Q. 

4.1. Algorithms 

Fig. 17 depicts a few simple STL algorithms implemented using generic functions in Q. The
STL provides two versions of most algorithms, such as the overloads for find in Fig. 17. The
first version is higher-order, taking a predicate function as its third parameter, while the
second version relies on operator==. Functions are first-class in Q, so the higher-order version
is straightforward to express. As is typical in the STL, there is a high degree of internal reuse:
remove uses remove_copy and find.

Fig. 18. The STL Iterator Concepts in Q (Part I).

    concept InputIterator<X> {
      type value;
      type difference;
      refines EqualityComparable<X>;
      refines Regular<X>;
      require SignedIntegral<difference>;
      fun operator*(X) -> value@;
      fun operator++(X!) -> X!;
    };

    concept OutputIterator<X,T> {
      refines Regular<X>;
      fun operator<<(X!, T) -> X!;
    };

    concept ForwardIterator<X> {
      refines DefaultConstructible<X>;
      refines InputIterator<X>;
      fun operator*(X) -> value;
    };

    concept MutableForwardIterator<X> {
      refines ForwardIterator<X>;
      refines OutputIterator<X,value>;
      require Regular<value>;
      fun operator*(X) -> value!;
    };



4.2. Iterators 

Figures 18 and 19 show the STL iterator hierarchy as represented in Q. Required operations 
are expressed in terms of function signatures, and associated types are expressed with a nested 
type requirement. The refinement hierarchy is established with the refines clauses and nested 
model requirements with require. The semantic invariants and complexity guarantees of the 
iterator concepts are not expressible in Q as they are beyond the scope of its type system. 

4.3. Automatic Algorithm Selection 

To realize the generic programming efficiency goals, Q provides mechanisms for automatic 
algorithm selection. The following code shows two overloads for copy. (We omit the third over- 
load to save space.) The first version is for input iterators and the second for random access, 
which uses an integer counter thereby allowing some compilers to better optimize the loop. The 
two signatures are the same except for the where clause. 

    fun copy<Iter1, Iter2> where { InputIterator<Iter1>,
                                   OutputIterator<Iter2, InputIterator<Iter1>.value> }
    (Iter1@ first, Iter1 last, Iter2@ result) -> Iter2@ {
      for (; first != last; ++first) result << *first;
      return result;
    }

    fun copy<Iter1, Iter2> where { RandomAccessIterator<Iter1>,
                                   OutputIterator<Iter2, InputIterator<Iter1>.value> }
    (Iter1@ first, Iter1 last, Iter2@ result) -> Iter2@ {
      for (let n = last - first; n > zero(); --n, ++first) result << *first;
      return result;
    }

Fig. 19. The STL Iterator Concepts in Q (Part II).

    concept BidirectionalIterator<X> {
      refines ForwardIterator<X>;
      fun operator--(X!) -> X!;
    };

    concept MutableBidirectionalIterator<X> {
      refines BidirectionalIterator<X>;
      refines MutableForwardIterator<X>;
    };

    concept RandomAccessIterator<X> {
      refines BidirectionalIterator<X>;
      refines LessThanComparable<X>;
      fun operator+(X, difference) -> X@;
      fun operator-(X, difference) -> X@;
      fun operator-(X, X) -> difference@;
    };

    concept MutableRandomAccessIterator<X> {
      refines RandomAccessIterator<X>;
      refines MutableBidirectionalIterator<X>;
    };

The use of dispatching algorithms such as copy inside other generic algorithms is challenging
because overload resolution is based on the surrogate models from the where clause and not
on models defined for the instantiating type arguments. (This rule is needed for separate type
checking and compilation.) Thus, a call to an overloaded function such as copy may resolve to a
non-optimal overload. Consider the following implementation of merge. The Iter1 and Iter2
types are required to model InputIterator and the body of merge contains two calls to copy.

    fun merge<Iter1, Iter2, Iter3>
    where { InputIterator<Iter1>, InputIterator<Iter2>,
            LessThanComparable<InputIterator<Iter1>.value>,
            InputIterator<Iter1>.value == InputIterator<Iter2>.value,
            OutputIterator<Iter3, InputIterator<Iter1>.value> }
    (Iter1@ first1, Iter1 last1, Iter2@ first2, Iter2 last2, Iter3@ result)
    -> Iter3@ { ...
      return copy(first2, last2, copy(first1, last1, result));
    }

This merge function always calls the slow version of copy even though the actual iterators may
be random access. In C++, with tag dispatching, the fast version of copy is called because the
overload resolution occurs after template instantiation. However, C++ does not have separate type
checking for templates.

To enable dispatching for copy, the type information at the instantiation of merge must be 
carried into the body of merge (suppose it is instantiated with a random access iterator). This 
can be done with a combination of concept and model declarations. First, define a concept with 
a single operation that corresponds to the algorithm. 

    concept CopyRange<I1, I2> {
      fun copy_range(I1, I1, I2) -> I2@;
    };

Next, add a requirement for this concept to the type requirements of merge and replace the calls 
to copy with the concept operation copy_range. 

    fun merge<Iter1, Iter2, Iter3>
    where { CopyRange<Iter2, Iter3>, CopyRange<Iter1, Iter3> }
    (Iter1@ first1, Iter1 last1, Iter2@ first2, Iter2 last2, Iter3@ result)
    -> Iter3@ { ...
      return copy_range(first2, last2, copy_range(first1, last1, result));
    }

The final step of the idiom is to create parameterized model declarations for CopyRange. The
where clauses of the model definitions match the where clauses of the respective overloads
for copy. In the body of each copy_range there is a call to copy which will resolve to the
appropriate overload.

    model <Iter1, Iter2> where { InputIterator<Iter1>,
                                 OutputIterator<Iter2, InputIterator<Iter1>.value> }
    CopyRange<Iter1, Iter2> {
      fun copy_range(Iter1 first, Iter1 last, Iter2 result) -> Iter2@
      { return copy(first, last, result); }
    };

    model <Iter1, Iter2> where { RandomAccessIterator<Iter1>,
                                 OutputIterator<Iter2, InputIterator<Iter1>.value> }
    CopyRange<Iter1, Iter2> {
      fun copy_range(Iter1 first, Iter1 last, Iter2 result) -> Iter2@
      { return copy(first, last, result); }
    };

A call to merge with a random access iterator will use the second model to satisfy the re- 
quirement for CopyRange. Thus, when copy_range is invoked inside merge, the fast version of 
copy is called. A nice property of this idiom is that calls to generic algorithms need not change. A 
disadvantage of this idiom is that the interface of the generic algorithms becomes more complex. 
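
The effect of the idiom can be imitated in Python by passing the selected operation into the algorithm explicitly. This is a loose analogy under stated assumptions, not a description of how Q compiles models: static overload resolution is simulated by a table lookup keyed on the concept known at the point of resolution, lists stand in for iterator ranges, and all names (calls, resolve_copy, merge_static, copy_range_model, merge_with_copy_range) are hypothetical.

```python
calls = []  # records which overload of copy actually ran


def copy_input(seq, out):
    """Slow overload: one element at a time."""
    calls.append("input")
    for x in seq:
        out.append(x)
    return out


def copy_random_access(seq, out):
    """Fast overload: counted loop over an indexable range."""
    calls.append("random_access")
    for i in range(len(seq)):
        out.append(seq[i])
    return out


COPY_OVERLOADS = {
    "InputIterator": copy_input,
    "RandomAccessIterator": copy_random_access,
}


def resolve_copy(known_concept):
    """Simulated static resolution: only the concept known at this site is used."""
    return COPY_OVERLOADS[known_concept]


def merge_prefix(a, b, out):
    """Interleave two sorted ranges until one is exhausted."""
    i = j = 0
    while i < len(a) and j < len(b):
        if b[j] < a[i]:
            out.append(b[j])
            j += 1
        else:
            out.append(a[i])
            i += 1
    return i, j


def merge_static(a, b, out):
    """First version of merge: its where clause only promises InputIterator."""
    i, j = merge_prefix(a, b, out)
    copy = resolve_copy("InputIterator")
    return copy(b[j:], copy(a[i:], out))


def copy_range_model(actual_concept):
    """Model of CopyRange, chosen where merge is instantiated."""
    copy = resolve_copy(actual_concept)

    def copy_range(seq, out):
        return copy(seq, out)

    return copy_range


def merge_with_copy_range(a, b, out, copy_range_a, copy_range_b):
    """Second version of merge: the CopyRange operations arrive as arguments."""
    i, j = merge_prefix(a, b, out)
    return copy_range_b(b[j:], copy_range_a(a[i:], out))
```

Calling merge_static always records the slow overload, whereas merge_with_copy_range with copy_range_model("RandomAccessIterator") records the fast one, because the choice is made where the concrete iterator type is known.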

4.4. Containers 

The containers of the STL are implemented in Q using polymorphic classes. Fig. 20 shows an
excerpt of the doubly-linked list container in Q. As usual, a dummy sentinel node is used in the
implementation. With each STL container come iterator types that translate between the uniform
iterator interface and data-structure specific operations. Fig. 20 shows the list_iterator
which implements operator* in terms of x.node->data and implements operator++ by
performing the assignment x.node = x.node->next.

Fig. 20. Excerpt from a doubly-linked list container in Q.

    struct list_node<T> where { Regular<T>, DefaultConstructible<T> } {
      list_node<T>* next; list_node<T>* prev; T data;
    };

    class list<T> where { Regular<T>, DefaultConstructible<T> } {
      list() : n(new list_node<T>()) { n->next = n; n->prev = n; }
      ~list() { ... }
      list_node<T>* n;
    };

    class list_iterator<T> where { Regular<T>, DefaultConstructible<T> } {
      ... list_node<T>* node;
    };

    fun operator*<T> where { Regular<T>, DefaultConstructible<T> }
    (list_iterator<T> x) -> T { return x.node->data; }

    fun operator++<T> where { Regular<T>, DefaultConstructible<T> }
    (list_iterator<T>! x) -> list_iterator<T>!
    { x.node = x.node->next; return x; }

    fun begin<T> where { Regular<T>, DefaultConstructible<T> }
    (list<T> l) -> list_iterator<T>@
    { return @list_iterator<T>(l.n->next); }

    fun end<T> where { Regular<T>, DefaultConstructible<T> }
    (list<T> l) -> list_iterator<T>@ { return @list_iterator<T>(l.n); }

Not shown in Fig. 20 is the implementation of the mutable iterator for list (the list_iterator
provides read-only access). The definitions of the two iterator types are nearly identical; the only
difference is that operator* returns by read-only reference for the constant iterator whereas it
returns by read-write reference for the mutable iterator. The code for these two iterators should
be reused but Q does not yet have a language mechanism for this kind of reuse.

In C++ this kind of reuse can be expressed using the Curiously Recurring Template Pattern
(CRTP) [52] and by parameterizing the base iterator class on the return type of operator*. This
approach cannot be used in Q because the parameter passing mode may not be parameterized.
Further, the semantics of polymorphism in Q does not match the intended use here: we want
to generate code for the two iterator types at library construction time. A separate generative
mechanism is needed to complement the generic features of Q. As a temporary solution, we used
the m4 macro system to factor the common code from the iterators. The following is an excerpt
from the implementation of the iterator operators.

    define(`forward_iter_ops',
    `fun operator*<T> where { Regular<T>, DefaultConstructible<T> }
    ($1<T> x) -> T $2 { return x.node->data; } ...')
    forward_iter_ops(list_iterator, &) /* readonly */
    forward_iter_ops(mutable_list_iter, !) /* readwrite */

4.5. Adaptors 

The reverse_iterator class is a representative example of an STL adaptor. 

class reverse_iterator<Iter>
  where { Regular<Iter>, DefaultConstructible<Iter> }
{
  reverse_iterator(Iter base) : curr(base) { }
  reverse_iterator(reverse_iterator<Iter> other) : curr(other.curr) { }
  Iter curr;
};

The Regular requirement on the underlying iterator is needed for the copy constructor and
DefaultConstructible is needed for the default constructor. This adaptor flips the direction
of traversal of the underlying iterator, which is accomplished with the following operator* and
operator++. Because there is a call to operator-- on the underlying Iter type, we need to include
the requirement for Bidirectional Iterator.

fun operator*<Iter> where { BidirectionalIterator<Iter> }
(reverse_iterator<Iter> r) -> BidirectionalIterator<Iter>.value
{ let tmp = @Iter(r.curr); return *--tmp; }

fun operator++<Iter> where { BidirectionalIterator<Iter> }
(reverse_iterator<Iter>! r) -> reverse_iterator<Iter>!
{ --r.curr; return r; }

Polymorphic model definitions are used to establish that reverse_iterator is a model of the 
iterator concepts, as we discussed in Section 3.2. 
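
For comparison, the adaptor idea can be sketched in C++. This is an illustration only: it is neither the STL's std::reverse_iterator nor part of the Q library, and the names reverse_iter and collect_reversed are invented for this example. The sketch mirrors the Q operators above, with operator* dereferencing a decremented copy of the stored iterator and operator++ decrementing the stored iterator.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Hypothetical adaptor mirroring the Q reverse_iterator: it stores the
// underlying iterator and flips the direction of traversal.
template <typename Iter>
class reverse_iter {
public:
    typedef typename std::iterator_traits<Iter>::value_type value_type;

    reverse_iter() : curr() {}
    explicit reverse_iter(Iter base) : curr(base) {}

    // Dereference the element just before curr, as in `return *--tmp`.
    value_type operator*() const {
        Iter tmp = curr;
        --tmp;
        return *tmp;
    }

    // Advancing the adaptor moves the underlying iterator backwards.
    reverse_iter& operator++() {
        --curr;
        return *this;
    }

    bool operator==(const reverse_iter& other) const { return curr == other.curr; }
    bool operator!=(const reverse_iter& other) const { return curr != other.curr; }

private:
    Iter curr;
};

// Helper (an assumption for this example): copy a container's elements in
// reverse order by walking from end() to begin() through the adaptor.
template <typename Container>
std::vector<typename Container::value_type>
collect_reversed(const Container& c) {
    typedef typename Container::const_iterator iter;
    std::vector<typename Container::value_type> out;
    for (reverse_iter<iter> it(c.end()), last(c.begin()); it != last; ++it)
        out.push_back(*it);
    return out;
}
```

The underlying iterator must support operator--, which is why the Q version requires Bidirectional Iterator; in the C++ sketch the same requirement is implicit and is only checked when the template is instantiated.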

5. The Boost Graph Library 

A group of us at the Open Systems Lab performed a comparative study of language sup- 
port for generic programming [17]. We evaluated a half dozen modern programming languages 
by implementing a subset of the Boost Graph Library [13] in each language. We implemented 
a family of algorithms associated with breadth-first search, including Dijkstra's single-source 
shortest paths [53] and Prim's minimum spanning tree algorithms [54]. This section extends the 
previous study to include Q. We give a brief overview of the BGL, describe the implementation 
of the BGL in Q, and compare the results to those in our earlier study [17]. 

5.1. An overview of the BGL graph search algorithms 

Figure 21 depicts some graph search algorithms from the BGL, their relationships, and how 
they are parameterized. Each large box represents an algorithm and the attached small boxes 
represent type parameters. An arrow labeled <uses> from one algorithm to another specifies 
that one algorithm is implemented using the other. An arrow labeled <models> from a type 
parameter to an unboxed name specifies that the type parameter must model that concept. For 
example, the breadth-first search algorithm has three type parameters: G, C, and Vis. Each of
these has requirements: G must model the Vertex List Graph and Incidence Graph concepts, C
must model the Read/Write Map concept, and Vis must model the BFS Visitor concept. The
breadth-first search algorithm is implemented using the graph search algorithm.

Fig. 21. Graph algorithm parameterization and reuse within the Boost Graph Library. Arrows for redundant models
relationships are not shown. For example, the type parameter G of breadth-first search must also model Incidence Graph
because breadth-first search uses graph search.

The core algorithm of this library is graph search, which traverses a graph and performs user- 
defined operations at certain points in the search. The order in which vertices are visited is con- 
trolled by a type argument, B, that models the Bag concept. This concept abstracts a data structure 
with insert and remove operations but no requirements on the order in which items are removed. 
When B is bound to a FIFO queue, the traversal order is breadth-first. When it is bound to a 
priority queue based on distance to a source vertex, the order is closest-first, as in Dijkstra's 
single-source shortest paths algorithm. Graph search is also parameterized on actions to take at 
event points during the search, such as when a vertex is first discovered. This parameter, Vis, 
must model the Visitor concept (which is not to be confused with the Visitor design pattern). The 
graph search algorithm also takes a type parameter C for mapping each vertex to a color and C 
must model the Read/Write Map concept. The colors are used as markers to keep track of the 
progression of the algorithm through the graph. 
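
The role of the Bag parameter can be illustrated with a small C++ sketch. It is a simplification under stated assumptions, not the BGL or Q implementation: vertices are plain integers, the graph is a vector of adjacency vectors, the color map is a local vector, the visitor has a single event point, and the names fifo_bag, lifo_bag, recording_visitor, graph_search, and discovery_order are invented for this example.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

enum color { white, gray, black };

// Two hypothetical models of the Bag concept: insert, remove, and empty.
struct fifo_bag {
    std::deque<int> items;
    void insert(int v) { items.push_back(v); }
    int remove() { int v = items.front(); items.pop_front(); return v; }
    bool empty() const { return items.empty(); }
};

struct lifo_bag {
    std::deque<int> items;
    void insert(int v) { items.push_back(v); }
    int remove() { int v = items.back(); items.pop_back(); return v; }
    bool empty() const { return items.empty(); }
};

// A visitor that records the order in which vertices are discovered.
struct recording_visitor {
    std::vector<int> order;
    void discover_vertex(int v) { order.push_back(v); }
};

// Graph search over an adjacency list. The bag decides which discovered
// vertex is examined next; the visitor is notified when a vertex is first
// discovered.
template <typename Bag, typename Visitor>
void graph_search(const std::vector<std::vector<int> >& g, int source,
                  Bag& bag, Visitor& vis) {
    std::vector<color> colors(g.size(), white);  // stands in for the color map C
    colors[source] = gray;
    vis.discover_vertex(source);
    bag.insert(source);
    while (!bag.empty()) {
        int u = bag.remove();
        for (std::size_t i = 0; i < g[u].size(); ++i) {
            int v = g[u][i];
            if (colors[v] == white) {
                colors[v] = gray;
                vis.discover_vertex(v);
                bag.insert(v);
            }
        }
        colors[u] = black;
    }
}

// Convenience wrapper (an assumption for this example): run the search with
// a given bag type and return the discovery order.
template <typename Bag>
std::vector<int> discovery_order(const std::vector<std::vector<int> >& g,
                                 int source) {
    Bag bag;
    recording_visitor vis;
    graph_search(g, source, bag, vis);
    return vis.order;
}
```

Calling discovery_order<fifo_bag> visits vertices level by level, whereas discovery_order<lifo_bag> follows the most recently discovered vertex first. The search loop itself is unchanged; only the bag differs.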

The Read Map and Read/Write Map concepts represent variants of an important abstraction 
in the graph library: the property map. In practice, graphs represent domain-specific entities. 
For example, a graph might depict the layout of a communication network, its vertices repre- 
senting endpoints and its edges representing direct links. In addition to the number of vertices 
and the edges between them, a graph may associate values to its elements. Each vertex of a 
communication network graph might have a name and each edge a maximum transmission rate. 
Some algorithms require access to domain information associated with the graph representation. 
For example, Prim's minimum spanning tree algorithm requires "weight" information associated 
with each edge in a graph. Property maps provide a convenient implementation-agnostic means 
of expressing, to algorithms, relations between graph elements and domain-specific data. Some 
graph data structures directly contain associated values with each node; others use external asso- 
ciative data structures to implement these relationships. Interfaces based on property maps work 
equally well with both representations. 
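
A minimal C++ sketch of this idea follows. It assumes integer vertex identifiers and string-valued names, and the types external_name_map, vertex_record, and internal_name_map, as well as the function label_all, are hypothetical. The generic function is written only against get and put, so it works with either storage strategy.

```cpp
#include <map>
#include <string>
#include <vector>

// External storage: values live in an associative container beside the graph.
struct external_name_map {
    std::map<int, std::string> names;
};
std::string get(const external_name_map& m, int v) {
    std::map<int, std::string>::const_iterator it = m.names.find(v);
    return it == m.names.end() ? std::string() : it->second;
}
void put(external_name_map& m, int v, const std::string& s) { m.names[v] = s; }

// Internal storage: each vertex record carries its own name.
struct vertex_record {
    std::string name;
    std::vector<int> out;
};
struct internal_name_map {
    std::vector<vertex_record>* vertices;  // non-owning view into the graph
};
std::string get(const internal_name_map& m, int v) {
    return (*m.vertices)[v].name;
}
void put(internal_name_map& m, int v, const std::string& s) {
    (*m.vertices)[v].name = s;
}

// A generic function written only against get and put. It names vertices
// 0..n-1 as prefix + 'a', prefix + 'b', ... (so n is assumed to be at most 26)
// and returns the names joined by commas.
template <typename NameMap>
std::string label_all(NameMap& names, int n, const std::string& prefix) {
    std::string joined;
    for (int v = 0; v < n; ++v) {
        put(names, v, prefix + char('a' + v));
        if (v != 0) joined += ",";
        joined += get(names, v);
    }
    return joined;
}
```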

The graph algorithms are all parameterized on the graph type. Breadth-first search takes a 
type parameter G, which must model two concepts, Incidence Graph and Vertex List Graph. The
Incidence Graph concept defines an interface for accessing out-edges of a vertex. Vertex List
Graph specifies an interface for accessing the vertices of a graph in an unspecified order. The 
Bellman-Ford shortest paths algorithm [55] requires a model of the Edge List Graph concept, 
which provides access to all the edges of a graph. 

That graph capabilities are partitioned among three concepts illustrates generic programming's 
emphasis on minimal algorithm requirements. The Bellman-Ford shortest paths algorithm re- 
quires of a graph only the operations described by the Edge List Graph concept. Breadth-first 
search, in contrast, requires the functionality of two separate concepts. By partitioning the func- 
tionality of graphs, each algorithm can be used with any data type that meets its minimum re- 
quirements. If the three fine-grained graph concepts were replaced with one monolithic concept, 
each algorithm would require more from its graph type parameter than necessary and would thus 
unnecessarily restrict the set of types with which it could be used. 
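
The effect of fine-grained requirements can be sketched in C++ with two deliberately small graph types. Everything here is hypothetical and simplified: edge_list_graph offers only edges(g), adjacency_graph offers only num_vertices(g) and out_edges(v, g) (returning target vertices rather than edge descriptors), and the two algorithms are stand-ins chosen for brevity.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<int, int> edge;  // (source, target)

// Offers only the edge-list interface: edges(g).
struct edge_list_graph {
    std::vector<edge> all_edges;
};
const std::vector<edge>& edges(const edge_list_graph& g) { return g.all_edges; }

// Offers only the vertex-list and incidence interfaces:
// num_vertices(g) and out_edges(v, g).
struct adjacency_graph {
    std::vector<std::vector<int> > adj;
};
std::size_t num_vertices(const adjacency_graph& g) { return g.adj.size(); }
const std::vector<int>& out_edges(int v, const adjacency_graph& g) {
    return g.adj[v];
}

// Requires only edges(g), in the spirit of Bellman-Ford's requirements.
template <typename G>
std::size_t count_self_loops(const G& g) {
    std::size_t n = 0;
    const std::vector<edge>& es = edges(g);
    for (std::size_t i = 0; i < es.size(); ++i)
        if (es[i].first == es[i].second) ++n;
    return n;
}

// Requires num_vertices(g) and out_edges(v, g), as breadth-first search does.
template <typename G>
std::size_t max_out_degree(const G& g) {
    std::size_t best = 0;
    for (std::size_t v = 0; v < num_vertices(g); ++v) {
        std::size_t d = out_edges(static_cast<int>(v), g).size();
        if (d > best) best = d;
    }
    return best;
}
```

Each algorithm accepts the graph type that provides just the operations it uses. Had both been written against one monolithic graph interface, neither of these two graph types would qualify.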

The graph library design is suitable for evaluating generic programming capabilities of lan- 
guages because its implementation involves a rich variety of generic programming techniques. 
Most of the algorithms are implemented using other library algorithms: breadth-first search and 
Dijkstra's shortest paths use graph search, Prim's minimum spanning tree algorithm uses Dijk- 
stra's algorithm, and Johnson's all-pairs shortest paths algorithm [56] uses both Dijkstra's and 
Bellman-Ford shortest paths. Furthermore, type parameters for some algorithms, such as the 
G parameter to breadth-first search, must model multiple concepts. In addition, the algorithms 
require certain relationships between type parameters. For example, consider the graph search 
algorithm. The C type argument, as a model of Read/Write Map, is required to have an associ- 
ated key type. The G type argument is required to have an associated vertex type. Graph search 
requires that these two types be the same. 
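
In C++ such a relationship between associated types can be approximated with traits classes and a static assertion. The sketch below is an illustration under assumptions: graph_traits and map_traits are hypothetical traits templates defined here (not the BGL's), and simple_graph, color_vector, and mark_source are invented names.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical traits classes standing in for associated types.
template <typename G> struct graph_traits;
template <typename C> struct map_traits;

struct simple_graph {
    std::vector<std::vector<int> > adj;
};
template <> struct graph_traits<simple_graph> {
    typedef int vertex_descriptor;
};

struct color_vector {
    std::vector<int> colors;
};
template <> struct map_traits<color_vector> {
    typedef int key_type;
    typedef int value_type;
};
void put(color_vector& m, int key, int value) { m.colors[key] = value; }

// Marks the source vertex in the color map. The static_assert plays the role
// of the same-type constraint: the map's key type must be the graph's vertex
// type.
template <typename G, typename C>
void mark_source(const G& g, typename graph_traits<G>::vertex_descriptor s,
                 C& c) {
    static_assert(
        std::is_same<typename map_traits<C>::key_type,
                     typename graph_traits<G>::vertex_descriptor>::value,
        "color map key type must equal the graph's vertex type");
    (void)g;  // the graph is only needed for its associated vertex type here
    put(c, s, 1);
}
```

Unlike a same-type constraint in a where clause, the assertion in this sketch is checked only when the template is instantiated, not when the generic function is defined.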

As in our earlier study, we focus the evaluation on the interface of the breadth-first search 
algorithm and the infrastructure surrounding it, including concept definitions and an example 
use of the algorithm. 

5.2. Implementation in Q 

So far we have implemented breadth-first search and Dijkstra's single-source shortest paths in 
Q. This required defining several of the graph and property map concepts and an implementation 
of the adjacency_list class, a FIFO queue, and a priority queue.

The interface for the breadth-first search algorithm is straightforward to express in Q. It has 
three type parameters: the graph type G, the color map type C, and the visitor type Vis. The 
requirements on the type parameters are expressed with a where clause, using concepts that 
we describe below. In the interface of breadth_first_search, associated types and same-type
constraints play an important role in accurately tracking the relationships between the graph type, 
its vertex descriptor type, and the color property map. 

type Color = int ; 
let black = 0; 
let gray = 1; 
let white = 2; 

fun breadth_first_search<G, C, Vis>
  where { IncidenceGraph<G>, VertexListGraph<G>,
          ReadWritePropertyMap<C>,
          PropertyMap<C>.key == IncidenceGraph<G>.vertex_descriptor,
          PropertyMap<C>.value == Color,
          BFSVisitor<Vis,G> }
(G g, IncidenceGraph<G>.vertex_descriptor@ s, C c, Vis vis) { /*...*/ }

Figure 22 shows the definition of several graph concepts in Q. The Graph concept requires the 
associated types vertex_descriptor and edge_descriptor and some basic functionality for 
those types such as copy construction and equality comparison. This concept also includes the 
source and target functions. The Graph concept serves to factor common requirements out of 
the IncidenceGraph and VertexListGraph concepts. 

The IncidenceGraph concept introduces the capability to access out-edges of a vertex. The
access is provided by the out_edge_iterator associated type. The requirements for the out-edge
iterator are slightly more than the standard InputIterator concept and slightly less than
the ForwardIterator concept. The out-edge iterator must allow for multiple passes but dereferencing
an out-edge iterator need not return a reference (for example, it may return by-value
instead). Thus we define the following new concept to express these requirements.

concept MultiPassIterator<Iter> {
  refines DefaultConstructible<Iter>;
  refines InputIterator<Iter>;
  // semantic requirement: allow multiple passes through the range
};

In Figure 22, the IncidenceGraph concept uses a same-type constraint to require that the value
type of the iterator be the same type as the edge_descriptor. The VertexListGraph
concept adds the capability of traversing all the vertices in the graph using the associated
vertex_iterator.

Figure 23 shows the implementation of a graph in terms of a vector of singly-linked lists. 
Vertex descriptors are integers and edge descriptors are pairs of integers. The out-edge iterator 
is implemented with the vg_out_edge_iter class whose implementation is shown in Figure 24. 
The basic idea behind this iterator is to provide a different view of the list of target vertices, 
making it appear as a list of source-target pairs. 
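
The same view can be sketched in C++ for readers more familiar with that language. This is an illustrative analogue of the iterator described above, not a translation that the Q implementation depends on; std::forward_list stands in for slist, and the names vec_graph and out_edge_iter are hypothetical.

```cpp
#include <forward_list>
#include <utility>
#include <vector>

typedef std::vector<std::forward_list<int> > vec_graph;

// Hypothetical analogue of the out-edge iterator: it walks the list of target
// vertices for one source and presents each element as a (source, target) pair.
class out_edge_iter {
public:
    out_edge_iter() : src(0), iter() {}
    out_edge_iter(int src, std::forward_list<int>::const_iterator iter)
        : src(src), iter(iter) {}

    // Returns the edge by value; no reference to a stored pair exists.
    std::pair<int, int> operator*() const { return std::make_pair(src, *iter); }
    out_edge_iter& operator++() { ++iter; return *this; }
    bool operator==(const out_edge_iter& o) const { return iter == o.iter; }
    bool operator!=(const out_edge_iter& o) const { return iter != o.iter; }

private:
    int src;
    std::forward_list<int>::const_iterator iter;
};

// Returns the range of out-edges of src as a pair of iterators.
std::pair<out_edge_iter, out_edge_iter> out_edges(int src, const vec_graph& g) {
    return std::make_pair(out_edge_iter(src, g[src].begin()),
                          out_edge_iter(src, g[src].end()));
}
```

Because operator* builds the pair on each call, the iterator returns by value, which is why its requirements fall between those of an input iterator and a forward iterator.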

The property map concepts are defined in Figure 25. The ReadWritePropertyMap is a re- 
finement of the ReadablePropertyMap concept, which requires the get function, and the 
WritablePropertyMap concept, which requires the put function. Both of these concepts re- 
fine the PropertyMap concept which includes the associated key and value types. 

Figure 26 shows the definition of the BFSVisitor concept. This concept is naturally expressed
as a multi-parameter concept because the visitor and graph types are independent: a particular
visitor may be used with many different concrete graph types and vice versa. The use of refines
for Graph in BFSVisitor is somewhat odd (require would be more natural), but the refinement
provides direct and convenient access to the vertex and edge descriptor types. An alternative
would be to use require and some type aliases, but type aliases have not yet been added to
concept definitions.

Figure 27 presents an example use of the breadth_first_search function to output vertices
in breadth-first order. To do so, the test_vis visitor overrides the function discover_vertex;
empty implementations of the other visitor functions are provided by default_bfs_visitor.
A graph is constructed using the AdjacencyList class, and then breadth_first_search is
called.

Fig. 22. Graph concepts in Q.

concept Graph<G> {
  type vertex_descriptor;
  require DefaultConstructible<vertex_descriptor>;
  require Regular<vertex_descriptor>;
  require EqualityComparable<vertex_descriptor>;
  type edge_descriptor;
  require DefaultConstructible<edge_descriptor>;
  require Regular<edge_descriptor>;
  require EqualityComparable<edge_descriptor>;
  fun source(edge_descriptor, G) -> vertex_descriptor@;
  fun target(edge_descriptor, G) -> vertex_descriptor@;
};

concept IncidenceGraph<G> {
  refines Graph<G>;
  type out_edge_iterator;
  require MultiPassIterator<out_edge_iterator>;
  edge_descriptor == InputIterator<out_edge_iterator>.value;
  fun out_edges(vertex_descriptor, G)
    -> pair<out_edge_iterator, out_edge_iterator>@;
  fun out_degree(vertex_descriptor, G) -> int@;
};

concept VertexListGraph<G> {
  refines Graph<G>;
  type vertex_iterator;
  require MultiPassIterator<vertex_iterator>;
  vertex_descriptor == InputIterator<vertex_iterator>.value;
  fun vertices(G) -> pair<vertex_iterator, vertex_iterator>@;
  fun num_vertices(G) -> int@;
};



6. Related Work 

There is a long history of programming language support for polymorphism, dating back to the 
1970s [20, 57, 58, 59]. An early precursor to Q's concept feature can be seen in CLU's type set 
feature [58]. Type sets differ from concepts in that they rely on structural conformance whereas 
concepts use nominal conformance established by a model definition. Also, Q provides a means 
for composing concepts via refinement whereas CLU does not provide a means for composing 
type sets. Finally, CLU does not provide support for associated types. 

In mathematics, the notion of algebraic structure is equivalent to Q's concept, and has been in
use for a very long time [60].

Fig. 23. Implementation of a graph with a vector of lists.

fun source(pair<int,int> e, vector< slist<int> >) -> int@
  { return e.first; }
fun target(pair<int,int> e, vector< slist<int> >) -> int@
  { return e.second; }

model Graph< vector< slist<int> > > {
  type vertex_descriptor = int;
  type edge_descriptor = pair<int,int>;
};

fun out_edges(int src, vector< slist<int> > G)
  -> pair<vg_out_edge_iter, vg_out_edge_iter>@ {
  return make_pair(@vg_out_edge_iter(src, begin(G[src])),
                   @vg_out_edge_iter(src, end(G[src])));
}

fun out_degree(int src, vector< slist<int> > G) -> int@
  { return size(G[src]); }

model IncidenceGraph< vector< slist<int> > > {
  type out_edge_iterator = vg_out_edge_iter;
};

fun vertices(vector< slist<int> > G)
  -> pair<counting_iter, counting_iter>@
  { return make_pair(@counting_iter(0), @counting_iter(size(G))); }
fun num_vertices(vector< slist<int> > G) -> int@ { return size(G); }

model VertexListGraph< vector< slist<int> > > {
  type vertices_size_type = int;
  type vertex_iterator = counting_iter;
};

Type classes. The concept feature in Q is heavily influenced by the type class feature of
Haskell [61], with its nominal conformance and explicit model definitions. However, Q's support
for associated types, same-type constraints, and concept-based overloading is novel. Also,
Q's type system is fundamentally different from Haskell's: it is based on System F [20, 57]
instead of Hindley-Milner type inference [59]. This difference has some repercussions. In Q there
is more control over the scope of concept operations because where clauses introduce concept
operations into the scope of the body. This difference allows Haskell to infer type requirements
but induces the restriction that two type classes in the same module may not have operations with
the same name. A difference we discussed in Section 3.6.1 is that in Q, overlapping models may
coexist in separate scopes but still be used in the same program, whereas in Haskell overlapping
models may not be used in the same program. Haskell performed quite well in our comparative
study of support for generic programming [17]. However, we pointed out that Haskell was
missing support for associated types; work to remedy this has been reported in [22, 23].

Fig. 24. Out-edge iterator for the vector of lists.

class vg_out_edge_iter {
  vg_out_edge_iter() { }
  vg_out_edge_iter(int src, slist_iterator<int> iter)
    : src(src), iter(iter) { }
  vg_out_edge_iter(vg_out_edge_iter x)
    : iter(x.iter), src(x.src) { }
  slist_iterator<int> iter;
  int src;
};

fun operator=(vg_out_edge_iter! me, vg_out_edge_iter other)
  -> vg_out_edge_iter!
  { me.iter = other.iter; me.src = other.src; return me; }
model DefaultConstructible<vg_out_edge_iter> { };
model Regular<vg_out_edge_iter> { };

fun operator==(vg_out_edge_iter x, vg_out_edge_iter y) -> bool@
  { return x.iter == y.iter; }
fun operator!=(vg_out_edge_iter x, vg_out_edge_iter y) -> bool@
  { return x.iter != y.iter; }
model EqualityComparable<vg_out_edge_iter> { };

fun operator*(vg_out_edge_iter x) -> pair<int,int>@
  { return make_pair(x.src, *x.iter); }
fun operator++(vg_out_edge_iter! x) -> vg_out_edge_iter!
  { ++x.iter; return x; }
model InputIterator<vg_out_edge_iter> {
  type value = pair<int,int>;
  type difference = ptrdiff_t;
};

model MultiPassIterator<vg_out_edge_iter> { };

Wehr, Lämmel, and Thiemann [62] have proposed extending Java with generalized interfaces,
which bear a close resemblance to Q's concepts and Haskell's type classes, but add the capability
of run-time dispatch using existential quantification. (Q currently provides only universal
quantification, although programmers can work around this limitation with a tricky encoding [63].)

Signatures and functors. A rough analogy can be made between SML signatures [36] and
Q concepts, and between ML structures and Q models. However, there are significant differences.
Functors are module-level constructs and therefore provide a more coarse-grained mechanism
for parameterization than do generic functions. More importantly, functors require explicit
instantiation with a structure, thereby making their use more heavyweight than generic functions
in F^G, which perform automatic lookup of the required model or instance. The associated
types and same-type constraints of Q are roughly equivalent to types nested in ML signatures
and to type sharing, respectively. We reuse some implementation techniques from ML such as a
union/find-based algorithm for deciding type equality [64]. There are numerous other languages
with parameterized modules [65, 66, 67] that require explicit instantiation with a structure.

Fig. 25. Property map concepts in Q.

concept PropertyMap<Map> {
  type key;
  type value;
};

concept ReadablePropertyMap<Map> {
  refines PropertyMap<Map>;
  fun get(Map, key) -> value;
};

concept WritablePropertyMap<Map> {
  refines PropertyMap<Map>;
  fun put(Map, key, value);
};

concept ReadWritePropertyMap<Map> {
  refines ReadablePropertyMap<Map>;
  refines WritablePropertyMap<Map>;
};

Fig. 26. Breadth-first search visitor concept.

concept BFSVisitor<Vis, G> {
  refines Regular<Vis>;
  refines Graph<G>;
  fun initialize_vertex(Vis v, vertex_descriptor d, G g) {}
  fun discover_vertex(Vis v, vertex_descriptor d, G g) {}
  fun examine_vertex(Vis v, vertex_descriptor d, G g) {}
  fun examine_edge(Vis v, edge_descriptor d, G g) {}
  fun tree_edge(Vis v, edge_descriptor d, G g) {}
  fun non_tree_edge(Vis v, edge_descriptor d, G g) {}
  fun gray_target(Vis v, edge_descriptor d, G g) {}
  fun black_target(Vis v, edge_descriptor d, G g) {}
  fun finish_vertex(Vis v, vertex_descriptor d, G g) {}
};

Recently, Dreyer, Harper, Chakravarty, and Keller proposed an extension of SML
signatures/functors, called modular type classes [68], that provides many of the benefits of Haskell
type classes such as implicit instantiation and instance passing. The design for modular type
classes differs from concepts in Q primarily in that it supports the convenience of type inference
at the price of disallowing overlapping instances in a given scope and first-class polymorphism.

Subtype-bound polymorphism. Less closely related to Q are languages based on subtype-bounded
polymorphism [69] such as Java, C#, and Eiffel. We found subtype-bounded polymorphism
less suitable for generic programming and refer the reader to [17] for an in-depth
discussion.

Fig. 27. Example use of the BFS generic function.

struct test_vis { };

fun discover_vertex<G>(test_vis, int v, G g) { printf("%d ", v); }

model <G> where { Graph<G>, Graph<G>.vertex_descriptor == int }
BFSVisitor<test_vis, G> { };

fun main() -> int@ {
  let n = 7;
  let g = @vector< slist<int> >(n);
  push_front(1, g[0]); push_front(4, g[0]);
  push_front(2, g[1]); push_front(3, g[1]);
  push_front(4, g[3]); push_front(6, g[3]);
  push_front(5, g[4]);
  let src = 0;
  let color = new Color[n];
  for (let i = 0; i != n; ++i)
    color[i] = white;
  breadth_first_search(g, src, color, @test_vis());
  return 0;
}

Row variable polymorphism. OCaml's object types [37, 70] and polymorphism over row variables
provide fairly good support for generic programming. However, OCaml lacks support for
associated types, so it suffers from clutter due to extra type parameters in generic functions.
PolyTOIL [71], with its match-bound polymorphism, provides support for generic programming
similar to OCaml's but also lacks associated types.

Virtual types. One of the proposed solutions for dealing with binary methods and associated
types in object-oriented languages is virtual types, that is, the nesting of abstract types in
interfaces and type definitions within classes or objects. The beginning of this line of research
was the virtual patterns feature of the BETA language [72]. Patterns are a generalization of
classes, objects, and procedures. An adaptation of virtual patterns to object-oriented classes,
called virtual classes, was created by Madsen and Møller-Pedersen [73] and an adaptation for Java
was created by Thorup [74]. These early designs for virtual types were not statically type safe,
but relied on dynamic type checking. However, a statically type safe version was created by
Torgersen [75]. A statically type safe version of BETA's virtual patterns was developed for the
gbeta language of Ernst [76, 77]; the Scala programming language also includes type safe virtual
types [78, 79].

7. Conclusion 

This article presents a new programming language named Q that is designed to meet the needs
of large-scale generic libraries. We demonstrated this with an implementation of the Standard
Template Library (STL) and the Boost Graph Library (BGL). We were able to implement all of
the abstractions in the STL and BGL in a straightforward manner. Further, Q is particularly
well-suited for the development of reusable components due to its support of modular type checking
and separate compilation. Q's strong type system provides support for the independent validation
of components and Q's system of concepts and constraints allows for rich interactions between
components without sacrificing encapsulation. The language features present in Q promise to
increase programmer productivity with respect to the development and use of generic components.